TL;DR
Basis Set Ventures needed to search continuously refreshed data on 10,000+ people and companies without the burden of managing embeddings or pipelines. Using Spice's data and AI platform, Basis Set investors can now run natural language searches directly against fresh datasets - ultimately delivering accurate, data-grounded insights that help them spot opportunities earlier and act faster.

Situation
Founded in 2017 by Dr. Lan Xuezhao, Basis Set Ventures is a San Francisco-based venture capital firm that targets investments in early-stage technology companies across the United States. Basis Set describes itself as an “AI-native venture fund” because of its strong internal use of AI to identify promising entrepreneurs and enterprises.
AI-driven technology has made it easier than ever for someone with an innovative idea to start a new company; this is great for entrepreneurs, but it makes it harder for venture capitalists to identify early prospects in which to invest.
“You don't have to be in Silicon Valley anymore,” says Lan Xuezhao, Ph.D., Basis Set Ventures Founding & Managing Partner. “You can start a company anywhere. You don't need to know anybody, it's just a very different game. The type of founder is different, the technical capabilities are different, and all of this makes it more difficult to find the people to invest in, who are the best at what they do.”
Basis Set saw this challenge as a data problem and created Pascal, their AI investment application for internal use. Pascal scours the internet looking for innovation, monitoring what’s happening on GitHub, Reddit, LinkedIn, X, and a number of other sources. It then uses its proprietary algorithms to track more opaque variables like community sentiment about code contributions or areas of traction.
“Pascal helps us identify early inflection points,” Dr. Xuezhao says. “Early identification of opportunities is essential. For example, we identified and invested in one promising company sourced by Pascal, when their valuation was $5 million and it is now worth $200 million.”
Challenge: Custom Searches Across Continuously Refreshed Data
Basis Set monitors more than 10,000 individuals and companies, with harvested updates pouring in from social media and other areas of the internet throughout the day. The firm needed to keep its data set continuously updated so its team of investors could monitor areas of interest in real time.
“One key challenge we had is that, due to the nature of our business, we need to keep our database extremely fresh,” says Rachel Wong, CTO & Partner at Basis Set. “That means pulling in updates to data every single day, both on people and company metrics.”
The company first tried building and managing its own vector-based search system, which converts words and images into numerical vectors that can then be matched for similarity.
“But we found for our use case, managing embeddings ourselves was impractical because we're updating our data daily, which means we'd have to update our embeddings daily too,” Wong says. “That obviously comes with some technical challenges and scalability challenges, so we were looking for an easier way to help our users search our database, which is when we found Spice AI.”
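To make the vector-based approach concrete, here is a minimal sketch of similarity search. It is not Basis Set's actual system: the `embed` function is a toy bag-of-words stand-in for a learned embedding model, and the profile records are hypothetical. The point it illustrates is that every record has to be converted to a vector before it can be searched at all.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a learned model)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, index: dict[str, Counter], top_k: int = 2) -> list[tuple[str, float]]:
    """Rank stored records by similarity to the query vector."""
    query_vec = embed(query)
    scored = [(key, cosine_similarity(query_vec, vec)) for key, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical records; each one must be embedded before it is searchable.
profiles = {
    "p1": "founder building developer tools in the bay area",
    "p2": "machine learning engineer formerly at uber",
    "p3": "designer working on consumer apps in new york",
}
index = {key: embed(text) for key, text in profiles.items()}

results = search("uber engineer", index)
print(results[0][0])  # best-matching record ID
```

Because the index is built from the record text, any change to a record means its vector has to be rebuilt, which is the maintenance cost Wong describes.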
Basis Set also sought to make it as easy as possible for the company’s investment partners to query its evolving data set, which meant converting natural language to SQL queries.
“With our first version of Pascal, our users were giving us multiple inputs on the kinds of people they were looking for,” says Muhammad Ammad, Staff Engineer at Basis Set Ventures. “We would manually go in and search for those people and form a pipeline and give those people to the investors. To do this we had to continuously change our code because they were giving us new information every day, and we had to sync with them with a lot of back and forth, with different dimensions and criteria. We wanted to reduce our communication and make things self-serve to give them the independence to search the database however they needed to.”
The specificity of the searches - which can involve individual work histories, social media contributions, GitHub postings, momentum, community sentiment and a variety of other factors - would be a significant operational lift using traditional search tooling.
Solution: Basis Set Adopts Spice.ai Enterprise, Purpose-Built to Help Enterprises Ground AI in Data
To solve these challenges, Basis Set adopted Spice.ai Enterprise, an open source and cloud-deployable runtime that unifies query federation, acceleration, hybrid search, and LLM inference in one system.
Spice automatically runs multiple forms of queries against the Basis Set data - including schema and semantic interrogation, data sampling, and evaluation - and then converts natural language searches into precise SQL queries, returning the most accurate answer to the user.
The Basis Set data remains on the company’s dedicated cloud-based infrastructure and communicates with the Spice Compute Engine managed in Spice Cloud. With the Spice Platform, Basis Set can continually add to its data stores without managing embeddings manually or requiring other pre-search preparations.

“With Spice we can say things like: ‘Show me founders in the Bay Area who worked at Uber in 2018,’ click search, and have Spice on the backend do the search, which is really awesome,” shared Wong.
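As an illustration of what a natural-language-to-SQL translation of that request could look like, the sketch below runs one possible SQL form of the query against an in-memory SQLite database. The table names, columns, and sample rows are all hypothetical; this is neither Basis Set's schema nor the SQL that Spice actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema and sample data, for illustration only.
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, region TEXT, is_founder INTEGER);
    CREATE TABLE work_history (
        person_id INTEGER, company TEXT, start_year INTEGER, end_year INTEGER
    );
    INSERT INTO people VALUES
        (1, 'Ada Example', 'Bay Area', 1),
        (2, 'Ben Sample', 'Bay Area', 1),
        (3, 'Cy Placeholder', 'New York', 1);
    INSERT INTO work_history VALUES
        (1, 'Uber', 2016, 2019),
        (2, 'Uber', 2019, 2021),
        (3, 'Uber', 2017, 2018);
    """
)

# One possible SQL translation of:
# "Show me founders in the Bay Area who worked at Uber in 2018"
sql = """
    SELECT DISTINCT p.name
    FROM people AS p
    JOIN work_history AS w ON w.person_id = p.id
    WHERE p.is_founder = 1
      AND p.region = ?
      AND w.company = ?
      AND ? BETWEEN w.start_year AND w.end_year
    ORDER BY p.name
"""
matches = [row[0] for row in conn.execute(sql, ("Bay Area", "Uber", 2018))]
print(matches)
```

Each phrase in the request maps to an explicit filter (role, region, employer, year), so the query runs against whatever rows are in the tables at that moment, with no index to rebuild first.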
Benefits
Enabling Natural Language Queries Without Managing Embeddings
Basis Set needed natural language queries so its investors could search precisely for the characteristics they sought. But, as noted earlier, the team found that managing embeddings, the process of converting human language to numerical representations, was too time-consuming and expensive to keep up with perpetually changing data stores. “Our data is continuously being updated, which means it is always changing,” Ammad says. “With embedding, if even one character is changed, that whole embedding is out of date, and we have to make a new embedding, which is also pretty expensive because we have more than 100,000 people in our database.”
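The staleness problem Ammad describes can be sketched as follows. This is an assumed bookkeeping scheme, not the one Basis Set used: a hash of the source text is stored next to each embedding, and any record whose current text hashes differently has to be re-embedded. The record IDs and text are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Fingerprint of the exact text an embedding was computed from."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_ids(records: dict[str, str], embedded_hashes: dict[str, str]) -> list[str]:
    """Return IDs whose current text no longer matches the text that was embedded."""
    return sorted(
        record_id
        for record_id, text in records.items()
        if embedded_hashes.get(record_id) != content_hash(text)
    )


# Hypothetical snapshot from yesterday, when embeddings were last computed.
yesterday = {"p1": "Founder at Acme", "p2": "Engineer at Uber", "p3": "Designer"}
embedded_hashes = {record_id: content_hash(text) for record_id, text in yesterday.items()}

# Today's refresh: one character changes in p2, and p4 is new.
today = dict(yesterday, p2="Engineer at Uber.", p4="Researcher")

to_reembed = stale_ids(today, embedded_hashes)
print(to_reembed)
```

With daily refreshes across a large database, this list is regenerated every day, and each entry in it means another call to an embedding model.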
With just a few lines of configuration code, Basis Set was able to do away with managing embeddings and instead use the Spice Cloud Platform to convert natural language queries to SQL, enabling always-fresh searches across Basis Set’s data. Ammad shared, “With Spice AI, we no longer have to keep updating embeddings to enable natural language queries. Spice takes care of all of that, which is awesome.”
Eliminating embedding management has saved significant time for the company.
“We were wrangling with the embeddings for some weeks, and found it quite frustrating,” Wong says. “As soon as we deployed Spice, those problems were gone.”
Behind the scenes, Spice goes beyond transforming natural language into SQL queries; it also tests multiple queries to find the one that generates the most precise results.
“Spice knows all our data structure and all the data that is within, so it's able to form queries on the fly that will work on our database, so we don't have to manually code queries,” Ammad says. “With this foundation it knows what queries it should generate to get the response required by our investors. Previously, writing detailed queries could be very complicated and would take a lot of time and lot of trial and error. With Spice, it is now very simple to query our data, which is ever expanding.”
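The idea of trying several candidate queries can be sketched as below. This is only an illustration under stated assumptions: the candidate list is hard-coded where a language model would normally propose queries, the schema and rows are hypothetical, and the selection rule (first candidate that executes and returns rows) is a placeholder, not Spice's actual evaluation logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema and sample data, for illustration only.
    CREATE TABLE companies (name TEXT, stage TEXT, founded_year INTEGER);
    INSERT INTO companies VALUES
        ('Acme Labs', 'pre-seed', 2017),
        ('Globex AI', 'seed', 2017),
        ('Initech', 'pre-seed', 2020);
    """
)

# Hypothetical candidates, as a language model might propose them.
candidates = [
    # Refers to a column that does not exist.
    "SELECT name FROM companies WHERE funding_round = 'pre-seed'",
    # Valid SQL, but the casing does not match the stored values.
    "SELECT name FROM companies WHERE stage = 'Pre-Seed' AND founded_year = 2017",
    # Matches the schema and the stored values.
    "SELECT name FROM companies WHERE stage = 'pre-seed' AND founded_year = 2017",
]

trace = []     # one entry per attempt, kept so the attempts can be inspected later
chosen = None  # (sql, rows) for the first candidate that runs and returns rows
for sql in candidates:
    try:
        rows = conn.execute(sql).fetchall()
        trace.append({"sql": sql, "ok": True, "rows": len(rows)})
        if rows and chosen is None:
            chosen = (sql, rows)
    except sqlite3.Error as exc:
        trace.append({"sql": sql, "ok": False, "error": str(exc)})

for attempt in trace:
    print(attempt)
```

Keeping a trace of every attempt, including the failed ones, is what makes this kind of loop inspectable rather than a black box.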
Eliminated Hallucinations
AI is famously prone to hallucinations, in which it authoritatively delivers false information. Spice mitigates hallucinations by grounding its AI in Basis Set's actual data and by using SQL queries, search, and LLM tools as inputs to AI prompts.
“We saw a lot of hallucinations when we were using the manual embedding approach,” says Wong. “One problem with our embedded data was that it didn’t understand the true sentiment. If we asked it to find people who worked at Dropbox in 2017, it could hallucinate and return people who actually worked at Box. False returns like that are counter-productive and lessen confidence in our tool.”
Hallucinations were eliminated after deploying Spice.ai. “We like the way Spice AI has approached this problem set,” Wong says. “Spice AI grounds AI in our actual data, using SQL queries across all our data, which brings accuracy to probabilistic AI systems, which are very prone to hallucinations.”
Ease of Use & Deployment
“One of the great things about Spice is that it was truly plug and play,” Wong says. “We jumped on a call with the Spice team, set up the configuration files with a couple lines of setup code, and were able to integrate it into our system pretty much immediately. Our whole business team was impressed. We told them we were bringing in a new search engine for our Pascal application, and two days later everyone was using it.”
The ability of Spice AI to support natural language searches has proven to be popular throughout the company, including with its investment team. “Spice takes a huge lift off our investment team,” Wong continues. “Previously, our investors had to configure very granular filters and settings for the algorithms on our platform. Now they can do search using regular English sentiment, which is a lot more natural for our users. They're investors, they're more business minded, so it makes sense for them to be able to just type in some heuristics of groups of people or companies that they want to be tracking, for example pre-seed companies working on MCP, who were founded in 2017—just whatever combinations they want to search for. It’s a lot more natural for their workflow than having to go in and think about every tiny setting.”
Observability
Basis Set values the observability Spice provides into how its AI-generated SQL queries are processed. “Spice AI gives us observability, with which we can actually see what is happening under the hood, and if something is not as we expect it to be, we can see where we need to change the prompt and so on,” says Ammad.
Spice's observability capabilities stand in contrast to the opacity of other AI platform solutions.
“We can actually track and see the different SQL queries that it's trying, which is really cool,” Wong says. “There's a lot of observability here, which we love as a technical team. We don’t want to be constrained by AI operating in a black box. We really need to know what it's doing. This allows us to test it. Yes, it found these people who worked at a certain company at a certain time, or it identifies an unknown company on the verge of going big. We can validate its precision, adjust if needed, and benefit from the insights we need to guide our investments.”
Conclusion
With Spice, Basis Set transformed the way its investors interact with data. The platform removes the burden of managing embeddings, reduces hallucinations, and makes natural language search both accurate and real-time. This leads to faster insights, greater confidence, and a competitive edge in spotting the next generation of high-growth startups.
Getting Started with Spice
Interested in giving Spice a try? Check out the following resources:
- Sign up for Spice Cloud for free, or get started with Spice Open Source
- Book a demo
- Explore the Spice cookbooks and docs
Interested in working with Spice AI or looking to learn a little more about the work we do? We are always looking for our next big challenge. Book an introductory call via our Calendly. Take a deeper look at our enterprise offerings by visiting Spice.ai.