DeepSeek

DeepSeek is an artificial intelligence company founded in 2023 by Liang Wenfeng and backed by the Chinese quantitative hedge fund High-Flyer. The company is dedicated to developing safe, efficient, and cost-effective large language models, with the goal of making AI technology more accessible and practical.

Company Development:

  • 2023: Company founded; released the first DeepSeek Coder and DeepSeek LLM models and launched a commercial API service
  • 2024: Released the DeepSeek-V2, DeepSeek-Coder-V2, DeepSeek-V2.5, and DeepSeek-V3 models, gaining wide recognition
  • 2025: Released the DeepSeek-R1 reasoning models; continuing model optimization and expanding application scenarios

The company offers three main model series:

DeepSeek-Coder-V2:

  • High-performance model optimized for code development
  • Excellent performance in code generation and understanding
  • Supports multiple programming languages and frameworks
  • Tied with DeepSeek-V2 for the highest quality score among DeepSeek models

DeepSeek-V2:

  • General-purpose large language model
  • Excels in reasoning, knowledge, and general tasks
  • Tied with DeepSeek-Coder-V2 for the highest quality score
  • Outstanding performance in scientific reasoning and quantitative analysis

DeepSeek-V2.5:

  • Latest generation general-purpose language model
  • Slightly improved output speed (17 tokens/second)
  • Maintains similar performance levels to previous generation
  • Optimized for practical application scenarios

Common features across all models:

  • Support for a 128K context window
  • Full Function Calling support
  • JSON mode output support
  • Time to first token between 1.12 and 1.15 seconds
  • Unified pricing: $0.17 per million tokens (blended price)
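The Function Calling and JSON-mode bullets above map onto an OpenAI-compatible request schema. Below is a minimal sketch of one request body; the field names follow that schema, while the `get_weather` tool and the message contents are hypothetical examples, not part of DeepSeek's API:

```python
import json

# Sketch of a Chat Completions request body combining JSON-mode output
# and Function Calling. Field names follow the OpenAI-compatible schema;
# the get_weather tool is a hypothetical example.
request_body = {
    "model": "deepseek-chat",
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"},
    ],
    "response_format": {"type": "json_object"},  # JSON-mode output
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        },
    ],
}
print(json.dumps(request_body, indent=2))
```

In practice this dictionary would be sent as the JSON body of a chat-completions request, with the model either answering directly in JSON or returning a tool call naming `get_weather`.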

Innovative features:

  • Context Caching on Disk technology
  • Up to 90% API cost savings on cache hits
  • Prompt cache pricing: $0.014/million tokens
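The 90% figure above follows directly from the price ratio: a cached prompt token is billed at $0.014/M instead of the regular input rate (taking the $0.14/M input price shown in the listings further down), and 1 − 0.014/0.14 = 90%. A small sketch of the arithmetic; the linear hit-ratio model is an assumption for illustration, not DeepSeek's billing formula:

```python
FULL_INPUT_PRICE = 0.14   # USD per 1M input tokens (cache miss)
CACHE_HIT_PRICE = 0.014   # USD per 1M input tokens (cache hit)

def effective_input_price(hit_ratio: float) -> float:
    """Blended input price per 1M tokens at a given cache-hit ratio."""
    return hit_ratio * CACHE_HIT_PRICE + (1 - hit_ratio) * FULL_INPUT_PRICE

max_savings = 1 - CACHE_HIT_PRICE / FULL_INPUT_PRICE
print(f"max savings on cache hits: {max_savings:.0%}")               # 90%
print(f"price at a 50% hit rate: ${effective_input_price(0.5):.3f}/M")  # $0.077/M
```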

Performance evaluation:

  • Excellent results in MMLU (reasoning and knowledge) tests
  • High scores in GPQA (scientific reasoning and knowledge) evaluation
  • Outstanding capabilities in mathematical reasoning (MATH) tests
  • Excellent performance in code evaluation (HumanEval)
  • Strong showing in LMSys Chatbot Arena ELO ratings

Technical Innovation and Research:

  • Multiple core technology patents
  • Research collaborations with top universities
  • Regular publication of technical research papers
  • Active participation in open-source community building

Corporate Vision: DeepSeek is committed to providing superior and economical AI solutions to global developers and enterprises through continuous technological innovation and model optimization. The company particularly emphasizes model practicality, reliability, and cost-effectiveness, striving to lower the barriers to AI application.

By striking a balance between model quality, performance, and pricing, DeepSeek is establishing a unique competitive advantage in the AI field while continuously promoting the democratization of AI technology.

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous version ...

DeepSeek V2.5
DeepSeek
125K context $0.14/M input tokens $0.28/M output tokens
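Given per-million-token prices like those in these listings, the cost of a single request is easy to estimate. A sketch using the $0.14/M input and $0.28/M output prices above; the token counts are made-up examples:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_price: float = 0.14, out_price: float = 0.28) -> float:
    """USD cost of one request, given per-1M-token prices."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# A 4,000-token prompt with a 1,000-token reply:
print(f"${request_cost(4_000, 1_000):.6f}")  # $0.000840
```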

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. To achieve efficient inference and cost-eff ...

DeepSeek V3
DeepSeek
62.5K context $0.14/M input tokens $0.28/M output tokens
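The MoE figures in the DeepSeek-V3 description above are worth making explicit: of the 671B total parameters, only the 37B routed to a given token participate in its forward pass, roughly 5.5%. A quick check:

```python
TOTAL_PARAMS_B = 671   # total parameters, in billions
ACTIVE_PARAMS_B = 37   # parameters activated per token, in billions

fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"active per token: {fraction:.1%}")  # active per token: 5.5%
```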

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions. Pre-trained on nearly 15 trillion tokens, the reported ...

DeepSeek V3
DeepSeek
62.5K context $0.14/M input tokens $0.28/M output tokens

DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, using outputs from DeepSeek R1. The m ...

DeepSeek: R1 Distill Llama 70B
DeepSeek
128K context $0.23/M input tokens $0.69/M output tokens

DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, using outputs from DeepSeek R1. The m ...

DeepSeek: R1 Distill Llama 70B (free)
DeepSeek
128K context $0 input tokens $0 output tokens

DeepSeek R1 Distill Llama 8B is a distilled large language model based on Llama-3.1-8B-Instruct, using outputs from DeepSeek R1. The mode ...

DeepSeek: R1 Distill Llama 8B
DeepSeek
31.25K context $0.04/M input tokens $0.04/M output tokens

DeepSeek R1 Distill Qwen 1.5B is a distilled large language model based on Qwen 2.5 Math 1.5B, using outputs from DeepSeek R1 ...

DeepSeek: R1 Distill Qwen 1.5B
DeepSeek
128K context $0.18/M input tokens $0.18/M output tokens

DeepSeek R1 Distill Qwen 14B is a distilled large language model based on Qwen 2.5 14B, using outputs from DeepSeek R1 ...

DeepSeek: R1 Distill Qwen 14B
DeepSeek
62.5K context $0.15/M input tokens $0.15/M output tokens

DeepSeek R1 Distill Qwen 32B is a distilled large language model based on Qwen 2.5 32B, using outputs from DeepSeek R1. It outperfo ...

DeepSeek: R1 Distill Qwen 32B
DeepSeek
128K context $0.12/M input tokens $0.18/M output tokens
20% OFF

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (R ...

DeepSeek: R1
DeepSeek
160K context $3/M input tokens $8/M output tokens

DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass. Fully ...

DeepSeek: R1 (free)
DeepSeek
160K context $0 input tokens $0 output tokens

DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass. Fully ...

DeepSeek: R1 (nitro)
DeepSeek
160K context $3/M input tokens $8/M output tokens