Tuesday, April 18, 2023

[tech] OpenAI ChatGPT API rate limits: /v1/chat/completions

 

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.openai.com', port=443): Max retries exceeded with url: /v1/chat/completions (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x124e78eb0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))


Rate Limit Advice | OpenAI Help Center https://help.openai.com/en/articles/6891753-rate-limit-advice

"Rate limits can be quantized, meaning they are enforced over shorter periods of time (e.g. 60,000 requests/minute may be enforced as 1,000 requests/second). Sending short bursts of requests or contexts (prompts+max_tokens) that are too long can lead to rate limit errors, even when you are technically below the rate limit per minute."


How can I fix it?

  • Include exponential back-off logic in your code. This will catch and retry failed requests (see the retry sketch after this list).

  • For token limits

    • Reduce the max_tokens to match the size of your completions. Usage needs are estimated from this value, so reducing it will decrease the chance that you unexpectedly receive a rate limit error. For example, if your prompt creates completions around 400 tokens, the max_tokens value should be around the same size (see the token-sizing sketch after this list).

    • Optimize your prompts. You can do this by making your instructions shorter, removing extra words, and getting rid of extra examples. You might need to work on your prompt and test it after these changes to make sure it still works well. The added benefit of a shorter prompt is reduced cost to you. If you need help, let us know.

  • For request limits

    • Batch your prompts in an array. This will reduce the number of requests you need to make. The prompt parameter can hold up to 20 unique prompts (see the batching sketch after this list).
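
A minimal back-off sketch for the first bullet, calling /v1/chat/completions directly with the requests library. The OPENAI_API_KEY environment variable, model name, retry count, and delays are placeholder assumptions, not values from the help article.

import os
import random
import time

import requests

API_URL = "https://api.openai.com/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}


def chat_with_backoff(payload, max_retries=6, base_delay=1.0):
    """POST to /v1/chat/completions, backing off exponentially on failures."""
    for attempt in range(max_retries):
        try:
            resp = requests.post(API_URL, headers=HEADERS, json=payload, timeout=60)
            if resp.status_code != 429:
                resp.raise_for_status()   # non-rate-limit HTTP errors propagate
                return resp.json()
            reason = "429 Too Many Requests"
        except (requests.exceptions.ConnectionError,
                requests.exceptions.Timeout) as err:
            reason = err                  # transient network error, retry it too
        # Exponential back-off with jitter: ~1s, 2s, 4s, 8s, ... plus noise.
        delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
        print(f"attempt {attempt + 1} failed ({reason}); retrying in {delay:.1f}s")
        time.sleep(delay)
    raise RuntimeError("giving up after repeated failures")


reply = chat_with_backoff({
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say hello."}],
    "max_tokens": 50,
})
print(reply["choices"][0]["message"]["content"])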

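For the max_tokens advice, a small token-sizing sketch using OpenAI's tiktoken tokenizer (pip install tiktoken). The 4,096-token context window and the 400-token completion estimate are assumptions for illustration; chat-format messages also add a few tokens of per-message overhead that this rough count ignores.

import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

prompt = "Summarize the following paragraph in two sentences: ..."
prompt_tokens = len(enc.encode(prompt))

# If completions for this prompt usually come back around 400 tokens,
# request roughly that much instead of leaving a large default.
expected_completion_tokens = 400
max_tokens = min(expected_completion_tokens, 4096 - prompt_tokens)

print(prompt_tokens, max_tokens)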

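For the request-limit advice, note that the array-valued prompt parameter belongs to the completions-style endpoint (/v1/completions); /v1/chat/completions takes a single messages list per request. A rough batching sketch, with a placeholder model name:

import os
import requests

API_URL = "https://api.openai.com/v1/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}

prompts = [
    "Translate 'good morning' to French:",
    "Translate 'good morning' to Japanese:",
    "Translate 'good morning' to German:",
]

# One request carries the whole batch, so it counts once against the
# requests-per-minute limit instead of once per prompt.
resp = requests.post(API_URL, headers=HEADERS, json={
    "model": "text-davinci-003",   # placeholder completions-style model
    "prompt": prompts,             # up to 20 prompts per request
    "max_tokens": 64,
}, timeout=60)
resp.raise_for_status()

# Each choice carries an "index" field that maps back to the prompt order.
for choice in sorted(resp.json()["choices"], key=lambda c: c["index"]):
    print(prompts[choice["index"]], "->", choice["text"].strip())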
Rate Limits - OpenAI API https://platform.openai.com/docs/guides/rate-limits/overview

