NASA Project Exploration & eXtraction is a specialized search system designed to simplify access to NASA’s research and technology portfolio. The system transforms complex, heterogeneous datasets into a unified, searchable knowledge base, significantly improving the discoverability of scientific and technical work.
Technical Evolution & Performance #
The project implemented a retrieval hierarchy that progresses from simple text matching to a sophisticated hybrid system. The Hybrid architecture delivers the strongest performance by merging traditional keyword precision with AI-driven conceptual depth.
| Model Stage | Architecture Focus | Effectiveness (MAP) |
|---|---|---|
| Base | Simplistic lexical matching (Baseline) | 0.462 |
| Tuned | Custom schema & field boosting | 0.483 |
| Advanced | Technical synonymy & stop-word retention | 0.499 |
| Semantic | AI-powered vector embeddings (OpenAI) | 0.510 |
| Hybrid | RRF Fusion of Lexical + Semantic | 0.560 |
The Result: The Hybrid approach emerged as the most robust solution overall, achieving a 0.560 Mean Average Precision (MAP) by successfully “filling in the gaps” where keywords alone were insufficient.
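The fusion step behind the Hybrid model can be illustrated with standard Reciprocal Rank Fusion, which scores each document by summing 1/(k + rank) across the lexical and semantic result lists. This is a minimal sketch of the general RRF algorithm, not the project's actual code; the document IDs and the choice of k = 60 (the constant from the original RRF formulation) are illustrative assumptions.

```python
def rrf_fuse(rankings, k=60):
    """Merge ranked result lists via Reciprocal Rank Fusion (RRF).

    rankings: iterable of ranked doc-id lists (e.g. a lexical run
    and a semantic run). Returns doc ids sorted by fused score.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            # Each list contributes 1/(k + rank); k damps the
            # influence of top-of-list outliers.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical runs: a doc ranked well by both engines rises to the top.
lexical = ["proj_42", "proj_17", "proj_03"]
semantic = ["proj_17", "proj_99", "proj_42"]
print(rrf_fuse([lexical, semantic]))  # proj_17 first: strong in both lists
```

Because RRF works on ranks rather than raw scores, it needs no score normalization between the keyword engine and the vector engine, which is what makes it a natural glue for a lexical-plus-semantic setup.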
The Architecture #
Behind the scenes, a robust pipeline handles the unique challenges of specialized technical data.

- Intelligent ETL Pipeline: A custom Python-based pipeline was built to scrape project narratives directly from the NASA research portal.
- High-Coverage Extraction: The scraping operation achieved a 98.4% success rate, extracting detailed descriptions and maturity ratings for over 16,000 projects.
- Data Normalization: The system implements automated cleaning, including ISO 8601 date standardization and multi-strategy matching for technology taxonomies.
- Dual-Engine Storage: The architecture utilizes Apache Solr for indexing and MongoDB for persistent metadata storage.
- Semantic Understanding: High-dimensional OpenAI embeddings (3072-dim) are utilized to capture conceptual relevance even when exact terminology differs.
Evaluation Pipeline #
To ensure high-quality retrieval, a rigorous evaluation framework was implemented based on the standard Text Retrieval Conference (TREC) framework.

- TREC Pooling Methodology: To create a manageable ground truth from a massive collection, the system utilized a pooling strategy where the top 100 results from each retrieval model were merged to form a unified Judgment Pool.
- Hybrid Relevance Assessment: A two-stage model combined AI efficiency with human expertise.
- LLM-as-a-Judge: GPT-4o mini initially screened the judgment pool, providing graded relevance scores and textual justifications.
- Human Adjudication: “Uncertain” cases (scores between 40 and 60) were flagged for manual review, where human expertise resolved 48 borderline document-topic pairs to ensure the credibility of the final judgments.
- Standard Metrics: Performance was benchmarked using standard IR metrics, including MAP, P@10, and nDCG.
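The pooling and triage steps above can be sketched in a few lines: the top results from each run are merged into one judgment pool, and LLM relevance scores in the uncertain band are routed to human review. The run names, score dictionary, and function names are hypothetical; only the pool-depth and 40-60 triage band come from the description above.

```python
def build_judgment_pool(runs, depth=100):
    """Merge the top-`depth` results from each retrieval run into one pool."""
    pool = set()
    for ranked_docs in runs.values():
        pool.update(ranked_docs[:depth])
    return pool

def triage(llm_scores, low=40, high=60):
    """Split graded LLM scores into auto-accepted judgments and a
    human-review queue for borderline cases."""
    auto, review = {}, []
    for doc_id, score in llm_scores.items():
        if low <= score <= high:
            review.append(doc_id)  # borderline: flag for manual adjudication
        else:
            auto[doc_id] = score
    return auto, review

# Hypothetical mini-example with pool depth 2 per run.
runs = {"lexical": ["d1", "d2", "d3"], "semantic": ["d2", "d4"]}
pool = build_judgment_pool(runs, depth=2)      # {"d1", "d2", "d4"}
auto, review = triage({"d1": 85, "d2": 50, "d4": 10})
print(sorted(pool), review)                    # ['d1', 'd2', 'd4'] ['d2']
```

Pooling keeps the annotation workload bounded (each run contributes at most `depth` documents per topic), while the triage band concentrates scarce human effort on exactly the pairs where the LLM judge is least reliable.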
Documentation #
Detailed information regarding the system’s architecture, methodology, and performance evaluations can be explored through the technical academic report and the source code.