
NPEX

NASA Project Exploration & eXtraction (NPEX) is a specialized search system designed to simplify access to NASA’s research and technology portfolio. It transforms complex, heterogeneous datasets into a unified, searchable knowledge base, significantly improving the discoverability of scientific and technical work.

NPEX Demo Video


Technical Evolution & Performance

The project implemented a retrieval hierarchy that progresses from simple text matching to a sophisticated hybrid system. The hybrid architecture delivers the best performance by merging traditional keyword precision with AI-driven conceptual depth.

| Model Stage | Architecture Focus                              | Effectiveness (MAP) |
|-------------|-------------------------------------------------|---------------------|
| Base        | Simple lexical matching (baseline)              | 0.462               |
| Tuned       | Custom schema & field boosting                  | 0.483               |
| Advanced    | Technical synonymy & stop-word retention        | 0.499               |
| Semantic    | AI-powered vector embeddings (OpenAI)           | 0.510               |
| Hybrid      | RRF fusion of lexical + semantic                | 0.560               |

The Result: The Hybrid approach emerged as the most robust solution overall, achieving a Mean Average Precision (MAP) of 0.560 by “filling in the gaps” where keywords alone were insufficient.
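The fusion step behind the hybrid model can be sketched with standard Reciprocal Rank Fusion (RRF). This is a minimal illustration, not the project's actual code; the document IDs, ranked lists, and the conventional constant k=60 are assumptions for the example.

```python
# Sketch of Reciprocal Rank Fusion (RRF): merge a lexical (keyword) ranking
# with a semantic (embedding-similarity) ranking into one ordered list.
def rrf_fuse(rankings, k=60):
    """Each document scores sum(1 / (k + rank)) over the lists it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["doc_a", "doc_b", "doc_c"]    # hypothetical keyword-search order
semantic = ["doc_c", "doc_a", "doc_d"]   # hypothetical embedding order

fused = rrf_fuse([lexical, semantic])
```

Documents ranked well by both engines (here `doc_a` and `doc_c`) rise to the top, which is exactly how the hybrid model recovers relevant results that keywords alone would miss.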

The Architecture

Behind the scenes, a robust pipeline handles the unique challenges of specialized technical data.

  • Intelligent ETL Pipeline: A custom Python-based pipeline was built to scrape project narratives directly from the NASA research portal.
  • High-Coverage Extraction: The scraping operation achieved a 98.4% success rate, extracting detailed descriptions and maturity ratings for over 16,000 projects.
  • Data Normalization: The system implements automated cleaning, including ISO 8601 date standardization and multi-strategy matching for technology taxonomies.
  • Dual-Engine Storage: The architecture utilizes Apache Solr for indexing and MongoDB for persistent metadata storage.
  • Semantic Understanding: High-dimensional OpenAI embeddings (3,072 dimensions) capture conceptual relevance even when exact terminology differs.
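The multi-strategy normalization step can be illustrated with date cleaning: try a series of known input formats and emit a canonical ISO 8601 string. A minimal sketch; the specific input formats tried here are assumptions, not the pipeline's actual list.

```python
# Sketch of ISO 8601 date normalization: attempt several candidate formats
# (hypothetical examples) and fall back to None for manual review.
from datetime import datetime

INPUT_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%b %d, %Y", "%B %d, %Y"]

def normalize_date(raw):
    for fmt in INPUT_FORMATS:
        try:
            # .isoformat() yields the canonical YYYY-MM-DD form.
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None  # unparseable: leave for manual review

print(normalize_date("Mar 14, 2021"))  # -> 2021-03-14
```

The same try-in-order pattern extends naturally to the taxonomy matching mentioned above: attempt an exact match first, then progressively looser strategies.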

Evaluation Pipeline

To ensure high-quality retrieval, a rigorous evaluation framework was implemented, based on the standard Text Retrieval Conference (TREC) methodology.

  • TREC Pooling Methodology: To create a manageable ground truth from a massive collection, the system utilized a pooling strategy where the top 100 results from each retrieval model were merged to form a unified Judgment Pool.
  • Hybrid Relevance Assessment: A two-stage model combined AI efficiency with human expertise.
  • LLM-as-a-Judge: GPT-4o mini initially screened the judgment pool, providing graded relevance scores and textual justifications.
  • Human Adjudication: “Uncertain” cases (scores between 40 and 60) were flagged for manual review, where human expertise resolved 48 borderline document-topic pairs to ensure final judgment credibility.
  • Standard Metrics: Performance was benchmarked using standard IR metrics, including MAP, P@10, and nDCG.
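The pooling step and the headline MAP metric can be sketched in a few lines. The runs, pool depth, and relevance judgments below are made up for illustration; only the mechanics mirror the TREC-style procedure described above.

```python
# Sketch of TREC-style pooling plus an average-precision computation,
# using hypothetical runs and relevance judgments.
def build_pool(runs, depth=100):
    """Union of the top-`depth` documents from each model's ranked run."""
    pool = set()
    for run in runs:
        pool.update(run[:depth])
    return pool

def average_precision(ranking, relevant):
    """AP for one topic; MAP is the mean of this over all topics."""
    hits, score = 0, 0.0
    for i, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            score += hits / i  # precision at each relevant hit
    return score / len(relevant) if relevant else 0.0

runs = [["d1", "d2", "d3"], ["d3", "d4", "d1"]]  # two models' ranked output
pool = build_pool(runs, depth=2)                 # docs sent for judging
qrels = {"d1", "d3"}                             # judged relevant (hypothetical)
map_score = average_precision(runs[0], qrels)
```

Only pooled documents ever reach the judges, which keeps assessment tractable while still covering every model's strongest candidates.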

Documentation

Detailed information regarding the system’s architecture, methodology, and performance evaluations can be explored through the technical academic report and the source code.