DeepSeek

DeepSeek is an artificial intelligence company founded in 2023 by Liang Wenfeng and backed by the Chinese quantitative hedge fund High-Flyer. The company is dedicated to developing safe, efficient, and cost-effective large language models, with the goal of making AI technology more accessible and practical.

Company Development:

  • 2023: Company founded; released the first DeepSeek Coder and DeepSeek LLM models and launched a commercial API service
  • 2024: Released the DeepSeek-V2, DeepSeek-Coder-V2, DeepSeek-V2.5, and DeepSeek-V3 models, gaining wide recognition
  • 2025: Released the DeepSeek-R1 reasoning models; continuing model optimization and expanding application scenarios

The company offers three main model series:

DeepSeek-Coder-V2:

  • High-performance model optimized for code development
  • Excellent performance in code generation and understanding
  • Supports multiple programming languages and frameworks
  • Tied with DeepSeek-V2 for the highest quality score among DeepSeek models

DeepSeek-V2:

  • General-purpose large language model
  • Excels in reasoning, knowledge, and general tasks
  • Tied with DeepSeek-Coder-V2 for the highest quality score
  • Outstanding performance in scientific reasoning and quantitative analysis

DeepSeek-V2.5:

  • Latest generation general-purpose language model
  • Slightly improved output speed (17 tokens/second)
  • Maintains similar performance levels to previous generation
  • Optimized for practical application scenarios

Common features across all models:

  • Support for a 128K context window
  • Full Function Calling support
  • JSON mode output support
  • Time to first token between 1.12 and 1.15 seconds
  • Unified pricing: $0.17 per million tokens (blended price)
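The Function Calling and JSON-mode bullets above map onto an OpenAI-compatible request schema. Below is a minimal sketch of one request body; the field names follow that schema, while the `get_weather` tool and the message contents are hypothetical examples, not part of DeepSeek's API:

```python
import json

# Sketch of a Chat Completions request body combining JSON-mode output
# and Function Calling. Field names follow the OpenAI-compatible schema;
# the get_weather tool is a hypothetical example.
request_body = {
    "model": "deepseek-chat",
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"},
    ],
    "response_format": {"type": "json_object"},  # JSON-mode output
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        },
    ],
}
print(json.dumps(request_body, indent=2))
```

In practice this dictionary would be sent as the JSON body of a chat-completions request, with the model either answering directly in JSON or returning a tool call naming `get_weather`.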

Innovative features:

  • Context Caching on Disk technology
  • Up to 90% API cost savings on cache hits
  • Prompt cache pricing: $0.014/million tokens
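The 90% figure above follows directly from the price ratio: a cached prompt token is billed at $0.014/M instead of the regular input rate (taking the $0.14/M input price shown in the listings further down), and 1 − 0.014/0.14 = 90%. A small sketch of the arithmetic; the linear hit-ratio model is an assumption for illustration, not DeepSeek's billing formula:

```python
FULL_INPUT_PRICE = 0.14   # USD per 1M input tokens (cache miss)
CACHE_HIT_PRICE = 0.014   # USD per 1M input tokens (cache hit)

def effective_input_price(hit_ratio: float) -> float:
    """Blended input price per 1M tokens at a given cache-hit ratio."""
    return hit_ratio * CACHE_HIT_PRICE + (1 - hit_ratio) * FULL_INPUT_PRICE

max_savings = 1 - CACHE_HIT_PRICE / FULL_INPUT_PRICE
print(f"max savings on cache hits: {max_savings:.0%}")               # 90%
print(f"price at a 50% hit rate: ${effective_input_price(0.5):.3f}/M")  # $0.077/M
```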

Performance evaluation:

  • Excellent results in MMLU (reasoning and knowledge) tests
  • High scores in GPQA (scientific reasoning and knowledge) evaluation
  • Outstanding capabilities in mathematical reasoning (MATH) tests
  • Excellent performance in code evaluation (HumanEval)
  • Strong showing in LMSys Chatbot Arena ELO ratings

Technical Innovation and Research:

  • Multiple core technology patents
  • Research collaborations with top universities
  • Regular publication of technical research papers
  • Active participation in open-source community building

Corporate Vision: DeepSeek is committed to providing superior and economical AI solutions to global developers and enterprises through continuous technological innovation and model optimization. The company particularly emphasizes model practicality, reliability, and cost-effectiveness, striving to lower the barriers to AI application.

By striking a balance between model quality, performance, and pricing, DeepSeek is establishing a unique competitive advantage in the AI field while continuously promoting the democratization of AI technology.

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous version ...

DeepSeek V2.5
DeepSeek
125K context $0.14/M input tokens $0.28/M output tokens
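Given per-million-token prices like those in these listings, the cost of a single request is easy to estimate. A sketch using the $0.14/M input and $0.28/M output prices above; the token counts are made-up examples:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_price: float = 0.14, out_price: float = 0.28) -> float:
    """USD cost of one request, given per-1M-token prices."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# A 4,000-token prompt with a 1,000-token reply:
print(f"${request_cost(4_000, 1_000):.6f}")  # $0.000840
```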

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. To achieve efficient inference and cost-eff ...

DeepSeek V3
DeepSeek
62.5K context $0.14/M input tokens $0.28/M output tokens
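The MoE figures in the DeepSeek-V3 description above are worth making explicit: of the 671B total parameters, only the 37B routed to a given token participate in its forward pass, roughly 5.5%. A quick check:

```python
TOTAL_PARAMS_B = 671   # total parameters, in billions
ACTIVE_PARAMS_B = 37   # parameters activated per token, in billions

fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"active per token: {fraction:.1%}")  # active per token: 5.5%
```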

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions. Pre-trained on nearly 15 trillion tokens, the reported ...

DeepSeek V3
DeepSeek
62.5K context $0.14/M input tokens $0.28/M output tokens

DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, using outputs from DeepSeek R1. The m ...

DeepSeek: R1 Distill Llama 70B
DeepSeek
128K context $0.23/M input tokens $0.69/M output tokens

DeepSeek R1 Distill Llama 70B is a distilled large language model based on Llama-3.3-70B-Instruct, using outputs from DeepSeek R1. The m ...

DeepSeek: R1 Distill Llama 70B (free)
DeepSeek
128K context $0 input tokens $0 output tokens

DeepSeek R1 Distill Llama 8B is a distilled large language model based on Llama-3.1-8B-Instruct, using outputs from DeepSeek R1. The mode ...

DeepSeek: R1 Distill Llama 8B
DeepSeek
31.25K context $0.04/M input tokens $0.04/M output tokens

DeepSeek R1 Distill Qwen 1.5B is a distilled large language model based on Qwen 2.5 Math 1.5B, using outputs from DeepSeek R1 ...

DeepSeek: R1 Distill Qwen 1.5B
DeepSeek
128K context $0.18/M input tokens $0.18/M output tokens

DeepSeek R1 Distill Qwen 14B is a distilled large language model based on Qwen 2.5 14B, using outputs from DeepSeek R1 ...

DeepSeek: R1 Distill Qwen 14B
DeepSeek
62.5K context $0.15/M input tokens $0.15/M output tokens

DeepSeek R1 Distill Qwen 32B is a distilled large language model based on Qwen 2.5 32B, using outputs from DeepSeek R1. It outperfo ...

DeepSeek: R1 Distill Qwen 32B
DeepSeek
128K context $0.12/M input tokens $0.18/M output tokens
20% OFF

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (R ...

DeepSeek: R1
DeepSeek
160K context $3/M input tokens $8/M output tokens

DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass. Fully ...

DeepSeek: R1 (free)
DeepSeek
160K context $0 input tokens $0 output tokens

DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass. Fully ...

DeepSeek: R1 (nitro)
DeepSeek
160K context $3/M input tokens $8/M output tokens