While working to reduce the cost and time of generating Global GraphRAG results, I found a marked improvement across all three measures: accuracy, cost, and time. I'm passing this along because it is very much a win-win-win scenario.

Global Graph RAG Overview

Retrieval-Augmented Generation (RAG) over graphs, often called GraphRAG, enhances RAG searches by focusing on the relationships between items. Given our library analogy: if a standard RAG AI hands the librarian a few select books chosen because they contain phrases semantically similar to our query, then Global GraphRAG hands the librarian summaries of groups of books to review. These answers have greater breadth, since they are more likely to pull in different subjects, but less depth, since summarizing necessarily removes detail.

The Common Approach

The Global GraphRAG implementations from Microsoft and LlamaIndex both operate by passing every query to the LLM alongside each community summary, one community at a time. This means that if you have 200 communities, each query requires 201 LLM calls: one per community, plus one to amalgamate the partial answers into a single response. This yields accurate answers, but it is expensive in both cost and latency. Depending on how many concurrent calls you can make to the LLM and how fast they process, a single query can take five minutes or more, making it unviable for real-time queries.
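The per-community flow can be sketched as follows. This is a minimal illustration, not the actual Microsoft or LlamaIndex code; `llm` stands in for whatever chat-completion client you use:

```python
def global_graph_rag(query, community_summaries, llm):
    """Sketch of the common approach: one LLM call per community,
    plus one final call to combine the partial answers.

    With 200 communities this makes 201 LLM calls per query.
    """
    partial_answers = []
    for summary in community_summaries:
        # One call per community: answer the query using only this summary.
        answer = llm(
            f"Using this community summary:\n{summary}\n\nAnswer: {query}"
        )
        partial_answers.append(answer)

    # Final call: amalgamate the per-community answers into one response.
    combined = "\n".join(partial_answers)
    return llm(
        f"Combine these partial answers into one:\n{combined}\n\nQuestion: {query}"
    )
```

The loop is what makes this approach slow: the number of LLM calls grows linearly with the number of communities.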

A Better Approach

I decided to see how much accuracy I would lose if, instead of passing the query to the LLM with a single community summary, I batched the summaries. In other words, send a group of community summaries with each call instead of just one.
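The batching change is a small one. Here is a sketch under the same assumptions as before (`llm` is a hypothetical client, and the prompt wording is illustrative):

```python
def batch_summaries(community_summaries, batch_size):
    """Group community summaries so each LLM call sees several at once.

    batch_size=1 reproduces the standard one-community-per-call approach;
    larger values cut the number of calls by roughly a factor of batch_size.
    """
    return [
        community_summaries[i:i + batch_size]
        for i in range(0, len(community_summaries), batch_size)
    ]

def batched_global_query(query, community_summaries, llm, batch_size=10):
    partial_answers = []
    for batch in batch_summaries(community_summaries, batch_size):
        # One call per batch of summaries instead of one call per summary.
        joined = "\n\n".join(batch)
        partial_answers.append(
            llm(f"Using these community summaries:\n{joined}\n\nAnswer: {query}")
        )
    combined = "\n".join(partial_answers)
    return llm(
        f"Combine these partial answers into one:\n{combined}\n\nQuestion: {query}"
    )
```

With 200 communities and a batch size of 10, this drops from 201 LLM calls per query to 21.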

I am a strong proponent of automating your RAG AI evaluations so you can easily change models and try new things. Therefore, I was in a good position to see what effect various levels of batching have on the evaluation score.

My hypothesis was that I would see a minor drop in the evaluation score but a huge decrease in time and cost. My goal was to find the point where the costs were minimized while maximizing the evaluation score.
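If your evaluations are automated, the search for that point is a simple sweep. A minimal sketch, assuming a hypothetical `eval_fn` hook into your own evaluation harness that scores a given batch size (and raises when the batched prompt exceeds the model's context window):

```python
def find_best_batch_size(eval_fn, batch_sizes=(1, 5, 10, 25)):
    """Run the automated evaluation at each batch size and keep the best.

    eval_fn(size) is assumed to run the full query set at that batch size
    and return a score, raising an exception on failure (e.g. a batched
    prompt that exceeds the model's max token limit).
    """
    results = {}
    for size in batch_sizes:
        try:
            results[size] = eval_fn(size)
        except Exception:
            # Record failures (such as exceeding max tokens) as None.
            results[size] = None
    best = max(
        (s for s in results if results[s] is not None),
        key=lambda s: results[s],
    )
    return best, results
```

The sweep makes the trade-off explicit: you see exactly where the score peaks and where batching starts to fail.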

The Surprising Findings

To my surprise, larger batches tended to increase the evaluation score! This meant that batching improved accuracy while decreasing both time and cost. A real win-win-win.

Evaluation score by batch size (GPT-4o):

Batch of 1: 7
Batch of 5: 9
Batch of 10: 9.2
Batch of 25: 9.5
Beyond that: Error. Exceeded Max Tokens

Since the community summaries were already part of a graph database, it was easy to cluster the summaries and pass them in as groups of similar communities. However, I found no difference between passing in clusters of similar summaries and grouping them randomly.

Conclusion

Using automated evaluation and trying some new approaches, I managed to make a major improvement in the evaluation scores for Global GraphRAG in my product. The reduction in cost alone is a significant benefit. I strongly recommend that anyone trying out GraphRAG build in automated evaluation and see which level of batching works best for their implementation. By doing so, you may find, as I did, that you can achieve greater accuracy while also reducing time and costs, creating a more efficient and effective system.

