Knowledge Management via Document Classification and Ranking in Complex Technical Eco-System

Project: Research project

Project Details

Description

Knowledge-related documents currently reside in various locations and systems. They include, but are not limited to, white papers, PowerPoint presentations, spreadsheets, internal notes, technical reports, and data, results, and conclusions from prior experiments. A centralized repository (physical or virtual) will be designed to hold these documents. This repository will constitute the document corpus for the data mining and knowledge management techniques to be developed in subsequent phases of the proposed research. It will be accessible through a unified data access layer (serving the data mining and knowledge management algorithms) and through innovative user interfaces open to authorized users (meeting all of Intel’s stringent IP security requirements). A secondary “test corpus” (collection of documents) will also be generated for development purposes and to jump-start and facilitate the external research effort until all IP and process-level security aspects have been fully addressed.

Based on prior work in the area of document classification by the principal investigator, we plan to conduct investigations in the following areas:

  • Development of a comprehensive ontology for the semiconductor manufacturing processes at Intel.
  • Development of document conversion techniques (file format conversions) and a suitable centralized document repository to hold and maintain the document corpus.
  • Indexing and ranking of documents based on the developed ontology and novel, domain-specific ranking/weighting criteria.
  • Development of appropriate data science approaches, including feature extraction, data mining, and domain-specific document similarity metrics.
  • Development of novel front-end interfacing concepts (user interfaces) for the system.
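
To make the idea of ontology-based ranking and document similarity more concrete, the following is a minimal sketch, not the project's actual method. It assumes a plain TF-IDF representation with cosine similarity, and it assumes that ontology terms simply receive a larger weight. The names `ONTOLOGY_WEIGHTS`, `tfidf_vectors`, `cosine`, and `rank`, the weight values, and the sample documents are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical ontology weights (an assumption for illustration only):
# terms that appear in the domain ontology count more than ordinary words.
ONTOLOGY_WEIGHTS = {"lithography": 3.0, "etch": 3.0, "wafer": 2.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs, weights=None):
    """Build one sparse TF-IDF vector (a dict) per document.

    Each term score is count * smoothed IDF * ontology weight, where the
    ontology weight defaults to 1.0 for terms outside the ontology.
    """
    weights = weights or {}
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        vec = {}
        for term, count in Counter(toks).items():
            idf = math.log((1 + n) / (1 + df[term])) + 1.0
            vec[term] = count * idf * weights.get(term, 1.0)
        vectors.append(vec)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs, weights=None):
    """Return document indices ordered from most to least similar to the query."""
    vectors = tfidf_vectors(docs + [query], weights)
    query_vec = vectors[-1]
    scores = [(cosine(query_vec, v), i) for i, v in enumerate(vectors[:-1])]
    return [i for _, i in sorted(scores, key=lambda p: (-p[0], p[1]))]


if __name__ == "__main__":
    corpus = [
        "Wafer etch rate results from the prior experiment",
        "Quarterly budget spreadsheet for travel",
        "Lithography overlay notes for wafer lots",
    ]
    print(rank("etch rate on wafer", corpus, ONTOLOGY_WEIGHTS))
```

In a sketch like this, the ontology enters only through the per-term weights; a fuller system would presumably also use the ontology's structure (for example, relations between process steps), which is beyond what this example attempts.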
Status: Finished
Effective start/end date: 6/1/14 to 5/31/17

Funding

  • Semiconductor Research Corporation (2014-IN-2523)
