Background: Rapid identification of subject experts for medical topics helps improve the implementation of discoveries, for example by shortening the time to market for drugs and aiding clinical trial recruitment. Identifying such people who influence opinion through social network analysis is gaining prominence. In this work, we explore how to combine named entity recognition from unstructured news articles with social network analysis to discover opinion leaders for a given medical topic.

Methods: We employed a Conditional Random Field algorithm to extract three categories of entities from health-related news articles: Person, Organization and Location. We used the latter two to resolve polysemy and synonymy among the person names, used simple rules to identify the subject experts, and then applied social network analysis techniques to discover the opinion leaders among them based on their media presence. A network was created by linking each pair of subject experts who are mentioned together in an article. Social network analysis metrics (including centrality metrics such as Betweenness, Closeness, Degree and Eigenvector) were used to rank the subject experts by their power in information flow.

Results: We extracted 734,204 person mentions from 147,528 news articles related to obesity published from January 1, 2007 through July 22, 2010. Of these, 147,879 mentions were marked as subject experts. The F-score for extracting person names is 88.5%. More than 80% of the subject experts who rank among the top 20 in at least one of the metrics could be considered opinion leaders in obesity.

Conclusion: The analysis of the network of subject experts with media presence revealed that an opinion leader might have few mentions in the news articles but a high network centrality measure, and vice versa. Betweenness, Closeness and Degree centrality measures were shown to supplement frequency counts in the task of finding subject experts. Further, opinion leaders missed in scientific publication network analysis could be retrieved from news articles.
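The network construction and ranking described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the toy articles, the expert names "A" through "E", and all function names are hypothetical, and only Degree and Closeness centrality are computed (Betweenness and Eigenvector are omitted for brevity). Closeness is scaled by the fraction of reachable nodes, which is an assumption of this sketch rather than a detail given in the abstract.

```python
from collections import Counter, defaultdict, deque
from itertools import combinations

# Hypothetical toy data: each article is the set of subject experts it mentions.
articles = [
    {"A", "B"}, {"A", "B"}, {"A", "B"},
    {"B", "C"}, {"C", "D"}, {"C", "E"},
]

def build_comention_graph(articles):
    """Link every pair of experts mentioned together in an article."""
    graph = defaultdict(set)
    for experts in articles:
        for name in experts:
            graph[name]  # register the node even if it has no co-mentions
        for u, v in combinations(sorted(experts), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph

def mention_counts(articles):
    """Number of articles in which each expert appears (frequency baseline)."""
    return Counter(name for experts in articles for name in experts)

def degree_centrality(graph):
    """Fraction of the other nodes each node is directly linked to."""
    n = len(graph)
    return {v: (len(nbrs) / (n - 1) if n > 1 else 0.0)
            for v, nbrs in graph.items()}

def closeness_centrality(graph):
    """Inverse mean shortest-path distance, scaled by the reachable fraction."""
    n = len(graph)
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        reachable = len(dist) - 1
        total = sum(dist.values())
        scores[source] = ((reachable / total) * (reachable / (n - 1))
                          if total > 0 else 0.0)
    return scores

def rank(scores):
    """Names ordered by descending score, ties broken alphabetically."""
    return sorted(scores, key=lambda name: (-scores[name], name))

graph = build_comention_graph(articles)
mentions = mention_counts(articles)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)

print("by mentions: ", rank(mentions))
print("by degree:   ", rank(degree))
print("by closeness:", rank(closeness))
```

In this toy data, the expert mentioned most often is not the one with the highest centrality, which mirrors the Conclusion's point that centrality measures can supplement frequency counts.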
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Health Informatics
- Computer Networks and Communications