TY - GEN
T1 - Unlocking the archives of displacement and trauma
T2 - 13th Annual Archiving Conference
AU - Travis, Diane M.
AU - Lee, Myeong
AU - Rojas, Magdalena
AU - Gunn, Allison
AU - Nimkar, Anuj
AU - Jansen, Gregory
AU - Diakopoulos, Nicholas
AU - Marciano, Richard
N1 - Publisher Copyright:
© 2016 Society for Imaging Science and Technology.
PY - 2016
Y1 - 2016
AB - This paper describes two innovative partnerships: a university-federal agency partnership (between the University of Maryland and the Office of Innovation at the National Archives and Records Administration, NARA) and a university-industry partnership (between the College of Information Studies, or "iSchool", at the University of Maryland and Archive Analytics Solutions Ltd.). In both, we are developing automated, scalable workflows that involve digitization, OCR, information extraction, and linking into interactive maps and graph databases. Digital preservation and archiving are performed using an innovative NoSQL Cassandra-based archival catalog and NetApp-based peta-scale storage infrastructure. This work contributes to linking sensitive, dispersed cultural resources involving the archives of displacement and trauma.
UR - http://www.scopus.com/inward/record.url?scp=84992035032&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84992035032&partnerID=8YFLogxK
U2 - 10.2352/issn.2168-3204.2016.1.0.135
DO - 10.2352/issn.2168-3204.2016.1.0.135
M3 - Conference contribution
AN - SCOPUS:84992035032
T3 - Archiving 2016 - Final Program and Proceedings
SP - 135
EP - 139
BT - Archiving 2016 - Final Program and Proceedings
PB - Society for Imaging Science and Technology
Y2 - 19 April 2016 through 22 April 2016
ER -