This section is for users who want to connect OpenHands to different LLMs.
Model Recommendations
Based on our evaluations of language models for coding tasks (using the SWE-bench dataset), we can provide some recommendations for model selection. Our latest benchmarking results can be found in this spreadsheet. Given these findings and community feedback, the following models have been verified to work reasonably well with OpenHands:
Cloud / API-Based Models
- anthropic/claude-sonnet-4-20250514 (recommended)
- anthropic/claude-sonnet-4-5-20250929 (recommended)
- openai/gpt-5-2025-08-07 (recommended)
- gemini/gemini-2.5-pro
- deepseek/deepseek-chat
- moonshot/kimi-k2-0711-preview
Local / Self-Hosted Models
- mistralai/devstral-small (20 May 2025) — also available through OpenRouter
- all-hands/openhands-lm-32b-v0.1 (31 March 2025) — also available through OpenRouter
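As a minimal sketch (assuming a recent vLLM install; the model id and port are illustrative), one of these models can be served behind an OpenAI-compatible API and reached from OpenHands via the Base URL advanced setting:

```bash
# Illustrative: serve Devstral locally with vLLM's OpenAI-compatible server.
vllm serve mistralai/Devstral-Small-2505 --port 8000

# In the OpenHands UI, set the Base URL (Advanced settings) to
# http://host.docker.internal:8000/v1 so the app container can reach the server.
```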
Known Issues
Most current local and open-source models are not as powerful as their cloud-hosted counterparts. When using such models, you may see long wait times between messages, poor responses, or errors about malformed JSON. OpenHands can only be as powerful as the model driving it. If you do find models that work well, please add them to the verified list above.
LLM Configuration
The following can be set in the OpenHands UI through the Settings:
- LLM Provider
- LLM Model
- API Key
- Base URL (through Advanced settings)
Some settings needed by certain LLMs/providers cannot be set through the UI. Instead, they can be set through environment variables passed to the docker run command using -e:
- LLM_API_VERSION
- LLM_EMBEDDING_MODEL
- LLM_EMBEDDING_DEPLOYMENT_NAME
- LLM_DROP_PARAMS
- LLM_DISABLE_VISION
- LLM_CACHING_PROMPT
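For example (a sketch; the image tag and values are illustrative, and required flags such as ports and volumes are omitted), variables can be passed when starting the app:

```bash
# Illustrative: pass LLM settings to the app container via -e.
docker run -it --rm \
  -e LLM_DROP_PARAMS=true \
  -e LLM_DISABLE_VISION=true \
  docker.all-hands.dev/all-hands-ai/openhands:latest  # tag is a placeholder
```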
We have guides for running OpenHands with specific providers:
- Azure
- Groq
- Local LLMs with SGLang or vLLM
- LiteLLM Proxy
- Moonshot AI
- OpenAI
- OpenHands
- OpenRouter
Model Customization
LLM providers have specific settings that can be customized to optimize their performance with OpenHands, such as:
- Custom Tokenizers: For specialized models, you can add a suitable tokenizer.
- Native Tool Calling: Toggle native function/tool calling capabilities.
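For instance, a hedged sketch of toggling native tool calling through the config.toml file used in development mode (the key name is an assumption; verify it against the configuration reference):

```toml
[llm]
# Assumption: disables native function/tool calling for models that emulate it poorly.
native_tool_calling = false
```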
API Retries and Rate Limits
LLM providers typically have rate limits, sometimes very low, and may require retries. OpenHands will automatically retry requests when it receives a rate limit error (429 error code). You can customize these options as needed for the provider you're using. Check their documentation, and set the following environment variables to control the number of retries and the time between retries:
- LLM_NUM_RETRIES (default: 4)
- LLM_RETRY_MIN_WAIT (default: 5 seconds)
- LLM_RETRY_MAX_WAIT (default: 30 seconds)
- LLM_RETRY_MULTIPLIER (default: 2)
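For example (values illustrative), a more patient retry policy for a heavily rate-limited provider might look like:

```bash
# Illustrative: raise retry counts and backoff via environment variables.
docker run -it --rm \
  -e LLM_NUM_RETRIES=8 \
  -e LLM_RETRY_MIN_WAIT=10 \
  -e LLM_RETRY_MAX_WAIT=120 \
  docker.all-hands.dev/all-hands-ai/openhands:latest  # tag is a placeholder
```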
If you are running OpenHands in development mode, these options can also be set in the config.toml file. A minimal sketch, assuming the [llm] section mirrors the environment variables above:
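```toml
[llm]
# Sketch: keys assumed to mirror the retry environment variables above.
num_retries = 4
retry_min_wait = 5
retry_max_wait = 30
retry_multiplier = 2
```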