Ted Hisokawa
Sep 26, 2025 03:41
GitHub introduces a new Copilot embedding model, enhancing code search in VS Code with improved accuracy and efficiency, according to GitHub’s announcement.
GitHub has announced a significant upgrade to its Copilot tool, introducing a new embedding model that promises to enhance code search within Visual Studio Code (VS Code). This development aims to make code retrieval faster, more memory-efficient, and significantly more accurate, as detailed in a recent GitHub blog post.
Enhanced Code Retrieval
The new Copilot embedding model delivers a 37.6% improvement in retrieval quality, doubles throughput, and makes the index eight times smaller. For developers, this means more accurate code suggestions, faster response times, and lower memory usage in VS Code. The model is better at surfacing the code snippets a query actually needs while returning fewer irrelevant results.
Why the Upgrade Matters
Efficient code search is crucial for a seamless AI coding experience. Embeddings, which are numeric vector representations of code and text, play a key role in retrieving semantically relevant code and natural language content: items with similar meaning end up close together in vector space, even when they share few keywords. Better embeddings yield higher retrieval quality, which in turn improves the overall GitHub Copilot experience.
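To make the idea of embedding-based retrieval concrete, here is a minimal sketch in Python. It is not GitHub's implementation: the three-dimensional vectors, the snippet names, and the `top_k` helper are all hypothetical, and real embedding models produce vectors with hundreds or thousands of dimensions. The sketch only shows the general pattern of ranking stored items by cosine similarity to a query vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=2):
    """Return the names of the k indexed items most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Hypothetical toy embeddings; a real model would compute these from code.
TOY_INDEX = {
    "parse_config": [0.9, 0.1, 0.0],
    "load_settings": [0.8, 0.3, 0.1],
    "render_chart": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a query such as "read configuration file".
query = [1.0, 0.2, 0.0]
print(top_k(query, TOY_INDEX, k=2))
```

In this toy index, the two configuration-related snippets rank above the charting snippet because their vectors point in nearly the same direction as the query.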
Technical Improvements
GitHub trained and deployed the new model specifically for code and documentation, improving context retrieval for various Copilot modes. According to GitHub, the code acceptance ratio rose by 110.7% for C# developers and by 113.1% for Java developers.
Training and Evaluation
The model was trained with contrastive learning using an InfoNCE loss, together with Matryoshka Representation Learning, a technique that keeps embeddings useful even when they are truncated to fewer dimensions. A key aspect of the training was the use of "hard negatives" (code examples that appear correct but are not), which helps the model distinguish between nearly correct and actually correct code snippets.
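The following Python sketch illustrates how these pieces can fit together. It is an illustration under stated assumptions, not GitHub's training code: the vectors, the temperature of 0.05, the prefix sizes in `dims`, and the function names are all hypothetical. The InfoNCE loss is the negative log-probability of picking the positive among all candidates, and the Matryoshka-style variant simply averages that loss over truncated prefixes of the embeddings.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length so dot products become cosines."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def info_nce(query, positive, negatives, temperature=0.05):
    """InfoNCE loss for one query: -log softmax score of the positive."""
    q = normalize(query)
    candidates = [positive] + negatives
    logits = [dot(q, normalize(c)) / temperature for c in candidates]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

def matryoshka_info_nce(query, positive, negatives, dims=(2, 4),
                        temperature=0.05):
    """Average InfoNCE over embedding prefixes (hypothetical sizes in dims)."""
    total = 0.0
    for d in dims:
        total += info_nce(query[:d], positive[:d],
                          [n[:d] for n in negatives], temperature)
    return total / len(dims)

# Hypothetical 4-dimensional embeddings.
query = [1.0, 0.0, 0.2, 0.1]
positive = [0.9, 0.1, 0.3, 0.1]        # the correct snippet
easy_negative = [-0.5, 0.9, 0.0, -0.2]  # clearly unrelated
hard_negative = [0.8, 0.3, 0.0, 0.1]    # looks similar but is wrong

loss_easy = info_nce(query, positive, [easy_negative])
loss_hard = info_nce(query, positive, [hard_negative])
loss_nested = matryoshka_info_nce(query, positive, [hard_negative])
print(loss_easy, loss_hard, loss_nested)
```

With these toy vectors, the hard negative produces a larger loss than the easy one, so it contributes a stronger training signal. That is the intuition behind mining hard negatives: examples the model already separates easily teach it little.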
Future Prospects
GitHub plans to expand its training and evaluation data to include more languages and repositories. The company is also refining its hard negative mining pipeline to enhance quality further, with goals to deploy larger, more accurate models leveraging the efficiency gains from this update.
This latest enhancement is a step towards making AI coding assistants more reliable and efficient for developers, promising a smarter and more dependable tool for everyday development.
Source: https://blockchain.news/news/github-copilot-enhances-code-search-with-new-embedding-model