Pick the cheapest AI that meets your needs.

DeepSeek V3 has gotten a lot of press for being a very smart AI at a very low cost. Because cost per API call is crucial to one of my company’s products, I wanted to see how it stacks up. I used my automated AI analyzer, which compares various models, to put DeepSeek V3 through its paces.

The short answer is that DeepSeek V3 was impressive, but much slower, less accurate, and even a bit more expensive than the current top performer for my needs, gpt-4o-mini. Given the hype, I was surprised that its answers were less accurate, but that could have to do with my very specific use case.

The Broader AI Landscape

The AI industry is fiercely competitive. While OpenAI and its ChatGPT garner the most headlines, there’s also Google’s Gemini, Anthropic’s Claude, Meta AI, plus a vast pool of open-source models—if you have the hardware and expertise to run them, that is. For many companies, though, self-hosting an open-source system is too steep a hurdle. Sometimes a simpler, cheaper hosted option is enough, provided it meets your requirements.


Observations on DeepSeek V3

1. Quality of Answers

My initial test results show DeepSeek V3 scoring slightly below my current comparison model, gpt-4o-mini. Despite claims that it rivals more advanced models, my analyzer (which rates how well each AI meets my specific product needs) placed DeepSeek V3 closer to simpler AI solutions. In practice, gpt-4o-mini offered more cohesive, reader-friendly answers, while DeepSeek’s replies, though correct, felt more compact and less polished. These minor stylistic differences accounted for an average one-point gap in my scoring system. That was still good enough that, had DeepSeek V3 met my other requirements (particularly speed) and been less costly, I would have moved to it.

| Query | Model | Response Time | Score (1–10) |
| --- | --- | --- | --- |
| #1 | gpt-4o-mini | 12.16s | 9 |
| #1 | DeepSeek | 31.73s | 8 |
| #2 | gpt-4o-mini | 11.93s | 8 |
| #2 | DeepSeek | 26.05s | 9 |
| #3 | gpt-4o-mini | 14.15s | 8 |
| #3 | DeepSeek | 32.24s | 7 |
| #4 | gpt-4o-mini | 16.08s | 9 |
| #4 | DeepSeek | Error encountered | N/A |
| #5 | gpt-4o-mini | 17.79s | 8 |
| #5 | DeepSeek | 32.08s | 9 |
| #6 | gpt-4o-mini | 15.07s | 9 |
| #6 | DeepSeek | 35.15s | 8 |
| #7 | gpt-4o-mini | 15.17s | 9 |
| #7 | DeepSeek | 33.51s | 8 |
| #8 | gpt-4o-mini | 13.81s | 9 |
| #8 | DeepSeek | 30.46s | 8 |
| #9 | gpt-4o-mini | 14.96s | 9 |
| #9 | DeepSeek | 25.45s | 8 |
| #10 | gpt-4o-mini | 13.70s | 9 |
| #10 | DeepSeek | 32.27s | 9 |

| Model | Avg. Response Time | Avg. Score |
| --- | --- | --- |
| gpt-4o-mini | 14.48s | 8.7 |
| DeepSeek | 30.99s\* | 8.22\* |

\* DeepSeek averages are computed over the nine queries that completed; Query #4 returned an error.
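For anyone wanting to run a similar comparison, the timing loop behind a table like this can be sketched in a few lines. This is an illustrative harness, not my actual analyzer; `ask_model` stands in for whatever client function you use, and the stub below just echoes its input:

```python
import statistics
import time

def benchmark(ask_model, queries):
    """Time each query; record None for calls that error out
    (like Query #4 above) and skip them when averaging."""
    times = []
    for q in queries:
        start = time.perf_counter()
        try:
            ask_model(q)
        except Exception:
            times.append(None)
        else:
            times.append(time.perf_counter() - start)
    valid = [t for t in times if t is not None]
    avg = statistics.mean(valid) if valid else float("nan")
    return times, avg

# Stubbed "model" that just echoes, to show the shape of the output:
times, avg = benchmark(lambda q: q.upper(), ["query 1", "query 2"])
```

Scoring answer quality is the harder half; here each response would still need to be rated, by hand or by a judge model, before averaging.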

2. Cost

DeepSeek V3’s calls are slightly pricier than gpt-4o-mini’s when you factor in both input and output tokens, but both are still far cheaper than advanced models like gpt-o1.

| Model | Context Length | Max Output Tokens | Input Price (per 1M tokens) | Output Price (per 1M tokens) |
| --- | --- | --- | --- | --- |
| deepseek-chat | 64K | 8K | $0.14 | $1.10 |
| deepseek-reasoner | 64K | 8K | $0.14 | $2.19 |
| gpt-4o-mini | 128K | 16K | $0.15 | $0.60 |
| gpt-o1 | 128K | 64K | $15.00 | $60.00 |
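To see what these figures mean per call, here is a quick back-of-the-envelope calculator using the numbers from the table (assuming, as is standard for these APIs, that the prices are per million tokens):

```python
# Prices in USD per 1M tokens, from the table above: (input, output).
PRICES = {
    "deepseek-chat": (0.14, 1.10),
    "gpt-4o-mini": (0.15, 0.60),
}

def call_cost(model, input_tokens, output_tokens):
    """Dollar cost of a single call given token counts."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# A typical RAG-style call: 8,000 input tokens, 1,000 output tokens.
deepseek = call_cost("deepseek-chat", 8_000, 1_000)  # ≈ $0.00222
mini = call_cost("gpt-4o-mini", 8_000, 1_000)        # ≈ $0.00180
```

For output-heavy workloads the gap widens, since DeepSeek’s output price is nearly double gpt-4o-mini’s, while input-heavy workloads keep the two close.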

3. Speed Concerns

DeepSeek V3’s average response time was 30.99 seconds, more than double gpt-4o-mini’s 14.48s. That’s a big deal for user-facing apps. DeepSeek says it supports streaming, but I couldn’t enable it, so responses arrived in a single chunk after a long wait. For purely internal scripts or batch jobs, this may not matter, but for anything interactive, it’s a real downside. Even with streaming, the speed would be a serious concern.
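To quantify why streaming matters, you can measure time-to-first-chunk separately from total time. The sketch below uses a stubbed generator in place of a real streaming response:

```python
import time

def first_chunk_and_total(stream):
    """Return (time to first chunk, total time) for any iterator
    of text chunks, e.g. a streaming API response."""
    start = time.perf_counter()
    first = None
    for _chunk in stream:
        if first is None:
            first = time.perf_counter() - start
    return first, time.perf_counter() - start

def slow_stream():
    """Stub: three chunks arriving 0.1s apart."""
    for piece in ("Hello", ", ", "world"):
        time.sleep(0.1)
        yield piece

ttft, total = first_chunk_and_total(slow_stream())
# With streaming, the user sees text after ~0.1s instead of
# waiting ~0.3s for the whole response.
```

The user’s perceived latency is the first number, not the second, which is why a 30-second non-streaming response feels so much worse than a 30-second streamed one.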


4. Context Window and Max Tokens

DeepSeek V3’s 64K context and 8K max output tokens are relatively small by modern standards. gpt-4o-mini has a 128K context and 16K max output tokens, while gpt-o1 can handle even more. For my RAG-based AI, which feeds in large amounts of domain-specific data, DeepSeek V3 can only handle about 30% of what gpt-4o-mini can, limiting its ability to reference comprehensive docs or produce lengthy outputs. For applications with shorter inputs and outputs, this would not be a concern at all.
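A rough pre-flight check can tell you whether a retrieved document set will fit a given window. The sketch below uses the crude ~4-characters-per-token heuristic for English text; a real tokenizer would be more accurate:

```python
def rough_tokens(text):
    # Very rough heuristic: ~4 characters per token for English text.
    return len(text) // 4

def fits(context_window, doc_text, prompt_overhead=1_000, output_budget=2_000):
    """Check whether a RAG payload fits a model's context window,
    leaving room for the prompt template and the answer."""
    needed = rough_tokens(doc_text) + prompt_overhead + output_budget
    return needed <= context_window

big_doc = "x" * 400_000        # ~100K tokens of retrieved text
fits(64_000, big_doc)          # False: too big for a 64K window
fits(128_000, big_doc)         # True: fits in a 128K window
```

If the check fails, you either retrieve fewer chunks or split the query across multiple calls, both of which tend to hurt answer quality.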


5. No Visual Support

DeepSeek V3 can’t interpret images beyond extracting text from documents, so if you need robust vision or image-based analysis, it’s not the right choice.


6. Lack of Asynchronous Support

DeepSeek doesn’t appear to provide asynchronous capabilities for calls. That might be a problem if your architecture relies heavily on event-driven or multi-request concurrency, as you’ll need extra workarounds or risk bottlenecks.
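One common workaround, if your stack is Python, is to push the blocking client calls onto worker threads with `asyncio.to_thread` so they overlap instead of running back-to-back. The sketch below uses a stubbed blocking call in place of a real API client:

```python
import asyncio
import time

def blocking_call(query):
    """Stand-in for a synchronous API call that takes ~0.2s."""
    time.sleep(0.2)
    return f"answer to {query}"

async def ask_many(queries):
    # Run the blocking calls in worker threads so they overlap.
    return await asyncio.gather(
        *(asyncio.to_thread(blocking_call, q) for q in queries)
    )

answers = asyncio.run(ask_many(["q1", "q2", "q3"]))
# Three ~0.2s calls finish in roughly 0.2s total, not 0.6s.
```

This doesn’t give you true async I/O, since each call still occupies a thread, but for moderate concurrency it avoids the serial bottleneck.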


7. It’s Open Source

On the plus side, you can theoretically download the model and run it yourself, which can be a huge cost saver if you have the right infrastructure. Otherwise, hosting your own large language model can be expensive in terms of hardware, power, and maintenance.


8. Security Considerations

DeepSeek V3 originates from a Chinese firm. As with all AI providers, you must handle proprietary data carefully, and international privacy laws can add extra complexity. Understand where data is stored and how your confidentiality clauses will hold up across borders. If you host the open-source model yourself, this is obviously much less of a concern than if you use DeepSeek’s platform.


9. Implementation Ease

Replacing your existing API calls with DeepSeek’s is straightforward. For simpler systems that don’t require advanced features, it can be dropped in without too much hassle.
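Part of why the swap is easy is that DeepSeek exposes an OpenAI-compatible chat-completions endpoint, so often only the base URL, API key, and model name change. A minimal sketch of the request construction (the URLs follow each provider’s documented pattern, and the keys shown are placeholders):

```python
import json

def chat_request(base_url, api_key, model, messages):
    """Build an OpenAI-style chat-completions request.
    Because DeepSeek follows the same wire format, switching
    providers is mostly a matter of changing three values."""
    url = f"{base_url}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages})
    return url, headers, body

messages = [{"role": "user", "content": "Hello"}]
# Same code path, different endpoint and model name:
openai_req = chat_request("https://api.openai.com/v1", "sk-...", "gpt-4o-mini", messages)
deepseek_req = chat_request("https://api.deepseek.com", "sk-...", "deepseek-chat", messages)
```

If you use the official `openai` Python client, the same idea applies: pass a different `base_url` and key when constructing the client and leave the rest of your code alone.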


10. API limits

DeepSeek V3 excels here by having no preset API limits. Most AI providers limit how often their API can be called, and in a production app, hitting that limit means your product is unusable for a period of time. Some APIs, such as Grok’s, have very low limits that make them difficult to use. You can usually get a higher limit by contacting the company and pleading your case, but these limits can be cumbersome to manage.
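When you do hit rate limits with other providers, the standard mitigation is retrying with exponential backoff. A minimal sketch, with a stub raising `RuntimeError` in place of a real HTTP 429 response:

```python
import time

def with_backoff(call, max_attempts=5, base_delay=1.0):
    """Retry a rate-limited API call with exponential backoff.
    `call` should raise RuntimeError (standing in for a 429)
    when the limit is hit."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Stub that fails twice, then succeeds:
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

result = with_backoff(flaky, base_delay=0.01)  # "ok" on the third try
```

Backoff keeps your app alive through short bursts, but it can’t fix a sustained limit that is simply below your traffic, which is why a no-limit API is genuinely attractive for production use.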


Caveats

AI APIs can perform very differently on different tasks, so you should rate each one against your specific application. While DeepSeek V3 does not perform as well for my applications, other benchmarks show it competing positively with OpenAI’s current flagship model, gpt-o1. It’s also notable that DeepSeek claims V3 was created for 1/20th or less of the price of gpt-o1. That’s an amazing achievement and bodes well for the future of AI.

Balancing the Trade-Offs

DeepSeek V3’s greatest appeal is its low cost. If it gives good answers for your content and you don’t require lightning-fast response times, large context windows, or streaming, it could deliver decent performance at a bargain. However, if you need high speed, big context capacities, or user-friendly streaming, you’ll probably outgrow it quickly.

For my user-focused product, speed and context capacity are critical, so DeepSeek V3 isn’t currently ideal, especially as it gives slightly less accurate answers at a slightly higher cost. Still, for an application where it produced higher-quality answers, it could suit an internal tool that prioritizes cost over interactivity. As always, test each AI for your specific use case.

My mantra: Pick the cheapest AI that meets your needs. Although DeepSeek V3 falls far short on speed and capacity, and slightly short on cost and accuracy for my application, it remains a compelling choice if you can accommodate those limitations. Whether it’s right for you depends on your goals, scaling needs, and budget. Meanwhile, the AI landscape keeps evolving—no single model remains superior for every scenario. I’ll keep testing as new models appear; with improvements to context, speed, accuracy and cost, DeepSeek could eventually challenge the current primary alternatives.

