OpenAI & GPT Overview
OpenAI Platform
Cloud-based platform providing access to state-of-the-art AI models including GPT, DALL-E, and Whisper.
GPT Models: Text generation and understanding
DALL-E: Image generation from text
Whisper: Speech-to-text transcription
Embeddings: Text representation vectors
Moderation: Content safety filtering
Pay-per-use based on tokens processed
Example: $0.03 per 1K input tokens for GPT-4 (8K context)
Free trial credits available for new accounts
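Pay-per-use pricing makes cost estimation simple arithmetic. A minimal sketch; the default rates below are illustrative (GPT-4 8K era), not current prices, so check openai.com/pricing before relying on them:

```python
# Rough cost estimator; default rates are illustrative, not current prices.
def estimate_cost(prompt_tokens, completion_tokens,
                  price_in_per_1k=0.03, price_out_per_1k=0.06):
    return (prompt_tokens / 1000) * price_in_per_1k \
         + (completion_tokens / 1000) * price_out_per_1k

cost = estimate_cost(1200, 400)  # 1.2 * $0.03 input + 0.4 * $0.06 output
```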
GPT Model Evolution
GPT-3: 175 billion parameters, the first massively scaled model in the series
Capabilities: Text completion, translation, Q&A
GPT-3.5: Improved version with better instruction following
Models: text-davinci-003, gpt-3.5-turbo
GPT-4: Multimodal model with image understanding
Features: Better reasoning, safer outputs
Context: 8K, 32K, 128K tokens
GPT-4 Turbo / GPT-4o: Enhanced versions with later knowledge cutoffs
Improvements: Cheaper, larger context window
API Fundamentals
Basic API Setup
pip install openai
# Legacy setup (openai library versions before 1.0)
import openai
openai.api_key = "sk-your-api-key-here"
# For newer versions (v1.0+)
from openai import OpenAI
client = OpenAI(api_key="sk-your-api-key-here")
# Environment variable (recommended)
import os
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
Never commit API keys to version control
Use environment variables or secret management
Rotate keys regularly
Set usage limits in OpenAI dashboard
Chat Completions API
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=500,
    temperature=0.7
)
# Extract response
answer = response.choices[0].message.content
print(answer)
system: Set assistant behavior and context
user: User inputs and questions
assistant: Previous assistant responses
tool: Function call results
model: Which GPT model to use
messages: Conversation history
max_tokens: Maximum response length
temperature: Creativity (0-2)
Advanced API Features
Function Calling
# Note: `functions` / `function_call` is the legacy interface; newer
# library versions wrap the same schema in `tools` / `tool_choice`.
functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }
]

# Call API with function definitions
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    functions=functions,
    function_call="auto"
)
Connect GPT to external APIs and databases
Execute code based on natural language requests
Build interactive applications with real-time data
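The model does not execute anything itself: when the response carries a function call, the application runs the function locally and sends the result back for a final answer. A sketch of the dispatch step, where `get_current_weather` is a hypothetical local implementation and `handle_function_call` an illustrative helper:

```python
import json

# Hypothetical local implementation of the function the model may request
def get_current_weather(location, unit="fahrenheit"):
    return json.dumps({"location": location, "temperature": "72", "unit": unit})

LOCAL_FUNCTIONS = {"get_current_weather": get_current_weather}

def handle_function_call(message):
    """If the model requested a function, run it and build the follow-up message."""
    call = message.get("function_call")
    if call is None:
        return None
    args = json.loads(call["arguments"])
    result = LOCAL_FUNCTIONS[call["name"]](**args)
    # Append this to `messages` and call the API again so the model
    # can produce its final natural-language answer.
    return {"role": "function", "name": call["name"], "content": result}
```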
Streaming & Async
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a long story"}],
    stream=True,
    max_tokens=1000
)
# Process stream
for chunk in response:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
import asyncio
from openai import AsyncOpenAI

# Awaiting requires the async client, not the synchronous OpenAI client
async_client = AsyncOpenAI()

async def get_completion():
    response = await async_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
        max_tokens=100
    )
    return response.choices[0].message.content

# Run async function
result = asyncio.run(get_completion())
Chat applications for real-time typing effect
Long responses to show progress
Reducing perceived latency
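When the complete text is also needed (for logging or caching), the streamed deltas can be accumulated while printing. A sketch under the same response shape as the streaming loop above:

```python
def collect_stream(chunks):
    """Accumulate streamed deltas into the full response text."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta is not None:
            parts.append(delta)
            print(delta, end="", flush=True)  # keep the typing effect
    return "".join(parts)
```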
Parameters & Tuning
Key Parameters
temperature: Controls output randomness
Low (0-0.3): Deterministic, consistent outputs
Medium (0.5-0.7): Balanced creativity and consistency
High (0.8-2.0): Creative, diverse outputs
Default: 1.0
max_tokens: Maximum number of tokens to generate in the response
Considerations: model context window, cost control
GPT-4 Turbo: up to 128,000 tokens of context
top_p: Nucleus sampling - considers only the tokens comprising the top_p probability mass
Alternative to temperature sampling (adjust one, not both)
Default: 1.0
Frequency penalty: Penalizes tokens in proportion to how often they have already appeared, reducing verbatim repetition
Presence penalty: Encourages new topics
Both range from -2.0 to 2.0
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    temperature=0.7,
    max_tokens=500,
    top_p=0.9,
    frequency_penalty=0.5,
    presence_penalty=0.3
)
Prompt Engineering
Define the assistant's role and personality
Set constraints and guidelines
Provide examples of desired behavior
system_message = """You are an expert Python programmer and educator.
Your responses should be:
- Clear and concise
- Include code examples when relevant
- Explain complex concepts in simple terms
- Focus on best practices and readability"""
messages = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": "Explain list comprehensions in Python"}
]
Provide examples in the conversation to guide the model
Show the format you want the response in
Demonstrate the reasoning process
messages = [
    {"role": "system", "content": "Extract product information from user queries."},
    {"role": "user", "content": "I want to buy a red laptop under $1000"},
    {"role": "assistant", "content": '{"category": "laptop", "color": "red", "max_price": 1000}'},
    {"role": "user", "content": "Find me a blue smartphone with 5G"}
]
Other OpenAI APIs
DALL-E Image Generation
response = client.images.generate(
    model="dall-e-3",
    prompt="A cute cat astronaut floating in space, digital art",
    size="1024x1024",
    quality="standard",
    n=1,
)
# Get image URL
image_url = response.data[0].url
print(f"Generated image: {image_url}")
model: dall-e-2 or dall-e-3
size: 256x256, 512x512, 1024x1024 (dall-e-2); 1024x1024, 1792x1024, 1024x1792 (dall-e-3)
quality: standard or hd (dall-e-3 only)
style: natural or vivid (dall-e-3 only)
Edit existing images with masks (dall-e-2)
Create variations of existing images (dall-e-2)
response = client.images.create_variation(
    image=open("original.png", "rb"),
    n=2,
    size="1024x1024"
)
Whisper & Embeddings
with open("speech.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        response_format="text"
    )
print(transcription)
response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="Your text string goes here"
)
# Get embedding vector
embedding = response.data[0].embedding
print(f"Embedding length: {len(embedding)}")
Semantic search: Find similar documents
Clustering: Group similar items
Recommendations: Suggest similar content
Anomaly detection: Identify outliers
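Semantic search, clustering, and recommendations all reduce to comparing embedding vectors, typically by cosine similarity:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

For semantic search, embed the query once, then rank documents by similarity to it.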
Multilingual speech recognition
Translation to English (translations endpoint)
Timestamp generation (verbose_json response format)
Fine-tuning & Custom Models
Fine-tuning Process
Specialized domain knowledge required
Consistent output format needed
Improved performance on specific tasks
Reduced latency for frequent queries
training_data = [
    {
        "messages": [
            {"role": "system", "content": "You are a helpful legal assistant."},
            {"role": "user", "content": "What is contract law?"},
            {"role": "assistant", "content": "Contract law governs..."}
        ]
    }
    # ... more examples
]
# Save as JSONL file
import json
with open("training_data.jsonl", "w") as f:
    for item in training_data:
        f.write(json.dumps(item) + "\n")
1. Prepare training data in JSONL format
2. Upload file to OpenAI
3. Create fine-tuning job
4. Monitor training progress
5. Use fine-tuned model
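Step 1 is worth sanity-checking locally before uploading. A minimal validator for the chat-format JSONL, with steps 2-3 shown as comments (`validate_jsonl` is an illustrative helper, not part of the library):

```python
import json

def validate_jsonl(path):
    """Check every line parses and matches the chat fine-tuning format."""
    with open(path) as f:
        for i, line in enumerate(f, 1):
            record = json.loads(line)
            assert "messages" in record, f"line {i}: missing 'messages' key"
            for msg in record["messages"]:
                assert msg["role"] in {"system", "user", "assistant"}, \
                    f"line {i}: unexpected role {msg['role']!r}"
    return True

# Steps 2-3, assuming a configured client:
# training_file = client.files.create(file=open("training_data.jsonl", "rb"),
#                                     purpose="fine-tune")
# job = client.fine_tuning.jobs.create(training_file=training_file.id,
#                                      model="gpt-3.5-turbo")
```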
Best Practices
Always implement proper error handling
Handle rate limits and quota exceeded
Implement retry logic with exponential backoff
import time
from openai import OpenAIError, RateLimitError
def safe_completion(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4",
                messages=messages,
                max_tokens=500
            )
            return response.choices[0].message.content
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # Exponential backoff
        except OpenAIError as e:
            print(f"OpenAI API error: {e}")
            return None
Use appropriate model for the task
Set reasonable max_tokens limits
Cache frequent responses
Monitor usage with OpenAI dashboard
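Caching frequent responses can be as simple as keying on the full request payload; a minimal sketch (`_cache` and `cached_completion` are illustrative names, and a real deployment would bound the cache size and add expiry):

```python
import hashlib
import json

_cache = {}

def cached_completion(client, messages, model="gpt-4", **kwargs):
    """Return a cached answer for identical requests instead of paying twice."""
    key = hashlib.sha256(
        json.dumps([model, messages, kwargs], sort_keys=True).encode()
    ).hexdigest()
    if key not in _cache:
        response = client.chat.completions.create(
            model=model, messages=messages, **kwargs)
        _cache[key] = response.choices[0].message.content
    return _cache[key]
```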
Never send sensitive data to the API
Implement content moderation
Use data processing addendum for business use
Review OpenAI's data usage policies
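Content moderation can use the Moderation endpoint as a pre-filter before user text reaches the main model; a sketch assuming a configured `client` (`is_flagged` is an illustrative wrapper name):

```python
def is_flagged(client, text):
    """Screen text with the Moderation endpoint before processing it further."""
    result = client.moderations.create(input=text)
    return result.results[0].flagged
```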
Resources & Tools
Useful Resources
- Official Documentation: platform.openai.com/docs
- API Reference: platform.openai.com/docs/api-reference
- Playground: platform.openai.com/playground
- Cookbook: github.com/openai/openai-cookbook
- Community Forum: community.openai.com
- Pricing: openai.com/pricing
- Status: status.openai.com
Development Tools
- OpenAI Python Library: github.com/openai/openai-python
- OpenAI Node.js Library: github.com/openai/openai-node
- LangChain: Framework for LLM applications
- LlamaIndex: Data framework for LLMs
- Streamlit: For building AI web apps
- Gradio: For creating AI demos
- Weights & Biases: For experiment tracking