Efforts to integrate contact network data with molecular surveillance data hold enormous promise for HIV tracking and intervention. However, the lack of tools to facilitate integrated molecular-social surveillance remains a substantial barrier to progress. For example, most contact network data contain information only on the immediate sexual and drug-use partners of a single individual. Yet the same partners can appear across the contact networks of multiple individuals. Therefore, partners must be matched across contact networks, a process called entity resolution (ER), to provide an accurate view of the overall contact network structure. ER currently requires either substantial resources to match individuals manually or considerable programming expertise to match individuals more efficiently using probabilistic models. Accordingly, this project will 1) develop a machine learning algorithm to match individuals across personal contact networks and validate it using a large existing dataset of young men who have sex with men, and 2) create a graphical user interface that implements the algorithm as an add-on package to an existing tool for network data capture and processing (Network Canvas). The result will be a free, open-source tool that drastically reduces barriers to matching individuals across contact networks, giving researchers and public health officials unencumbered access to the underlying structure of drug-use and sexual networks and a potent means of integrating contact network data with molecular surveillance.
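To make the matching problem concrete, the following is a minimal illustrative sketch of entity resolution across personal contact networks. It is not the project's algorithm: it swaps in a simple hand-weighted similarity score (standard-library string matching on names plus an age check) where the project proposes a machine learning model. The records, field names, weights, threshold, and the `similarity` and `resolve` helpers are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical partner records, each reported by a different participant ("ego").
records = [
    {"id": 0, "ego": "A", "name": "Jon Smith", "age": 24},
    {"id": 1, "ego": "B", "name": "John Smith", "age": 24},
    {"id": 2, "ego": "B", "name": "Marcus Lee", "age": 31},
    {"id": 3, "ego": "C", "name": "Markus Lee", "age": 30},
    {"id": 4, "ego": "C", "name": "Dana Ortiz", "age": 27},
]


def similarity(a, b):
    """Hand-weighted score in [0, 1]; the weights are illustrative assumptions."""
    name_score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    age_score = 1.0 if abs(a["age"] - b["age"]) <= 1 else 0.0
    return 0.8 * name_score + 0.2 * age_score


def resolve(records, threshold=0.85):
    """Group record ids that likely refer to the same person."""
    # Union-find structure: each record starts in its own cluster.
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Compare only records reported by different egos, and merge likely matches.
    for a, b in combinations(records, 2):
        if a["ego"] != b["ego"] and similarity(a, b) >= threshold:
            parent[find(a["id"])] = find(b["id"])

    clusters = {}
    for r in records:
        clusters.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(clusters.values())


if __name__ == "__main__":
    print(resolve(records))
```

In this toy data, the two spellings of each repeated partner are merged, so three separate personal networks become one connected network. A learned or probabilistic model would replace the fixed weights and threshold with values estimated from validated matches.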
Effective start/end date: 9/30/17 → 9/29/20
- National Library of Medicine (5R21LM012578-02)