OpenAI & GPT Overview

OpenAI Platform

OpenAI API

Cloud-based platform providing access to state-of-the-art AI models including GPT, DALL-E, and Whisper.

Key Features

GPT Models: Text generation and understanding

DALL-E: Image generation from text

Whisper: Speech-to-text transcription

Embeddings: Text representation vectors

Moderation: Content safety filtering

Pricing Model

Pay-per-use based on tokens processed

Example: $0.03 per 1K input tokens (and $0.06 per 1K output tokens) for GPT-4 8K; see openai.com/pricing for current rates

New accounts receive limited free trial credits
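Because billing is per token, request cost is simple arithmetic over input and output token counts. A minimal estimator, using the illustrative GPT-4 rates quoted above (real prices change; check openai.com/pricing):

```python
# Rough cost estimator for token-based pricing.
# Rates are illustrative (GPT-4 8K launch pricing), not current prices.
PRICES_PER_1K = {
    "gpt-4": {"input": 0.03, "output": 0.06},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost of one request."""
    rates = PRICES_PER_1K[model]
    return (input_tokens / 1000) * rates["input"] + (output_tokens / 1000) * rates["output"]

print(estimate_cost("gpt-4", 1000, 500))  # 1K prompt tokens + 500 completion tokens -> 0.06
```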

Getting Started: Sign up at platform.openai.com and get your API key to start building.

GPT Model Evolution

GPT-3 (2020)

175 billion parameters; demonstrated few-shot learning at unprecedented scale

Capabilities: Text completion, translation, Q&A

GPT-3.5 (2022)

Improved version with better instruction following

Models: text-davinci-003, gpt-3.5-turbo

GPT-4 (2023)

Multimodal model with image understanding

Features: Better reasoning, safer outputs

Context: 8K or 32K tokens (128K arrived with GPT-4 Turbo)

GPT-4 Turbo (2023)

Enhanced version with a more recent knowledge cutoff (April 2023 at launch)

Improvements: Cheaper, larger context window

Current Recommendation: Use gpt-4-turbo for most applications due to cost-effectiveness and updated knowledge.

API Fundamentals

Basic API Setup

# Install the OpenAI Python package
pip install openai

# Legacy setup (openai < 1.0, deprecated)
import openai
openai.api_key = "sk-your-api-key-here"

# Current setup (openai >= 1.0)
from openai import OpenAI
client = OpenAI(api_key="sk-your-api-key-here")

# Environment variable (recommended)
import os
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# OpenAI() with no arguments also reads OPENAI_API_KEY automatically
API Key Security

Never commit API keys to version control

Use environment variables or secret management

Rotate keys regularly

Set usage limits in OpenAI dashboard

Best Practice: Store API keys in environment variables or use a secrets manager for production applications.

Chat Completions API

# Basic chat completion
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=500,
    temperature=0.7
)

# Extract response
answer = response.choices[0].message.content
print(answer)
Message Roles

system: Set assistant behavior and context

user: User inputs and questions

assistant: Previous assistant responses

tool: Function call results

Key Parameters

model: Which GPT model to use

messages: Conversation history

max_tokens: Maximum response length

temperature: Creativity (0-2)

Advanced API Features

Function Calling

# Define tools for the model to call
# (the older functions/function_call parameters still work but are deprecated)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

# Call the API with tool definitions
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=tools,
    tool_choice="auto"
)
Function Calling Benefits

Connect GPT to external APIs and databases

Execute code based on natural language requests

Build interactive applications with real-time data

Use Case: Weather apps, booking systems, data analysis tools that require real-time information.
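The model never executes anything itself: it returns the function name plus a JSON string of arguments, and your code must parse them, run the function, and send the result back. A sketch of that round trip, with a hardcoded sample standing in for the model's response (the weather stub is hypothetical):

```python
import json

# Your real implementation of the declared function (stubbed here)
def get_current_weather(location, unit="fahrenheit"):
    return {"location": location, "temperature": 22, "unit": unit}

# Whether it arrives via the legacy function_call field or the newer
# tool_calls field, the model hands back a name and a JSON arguments string.
call_name = "get_current_weather"
call_arguments = '{"location": "Boston, MA", "unit": "celsius"}'

# Dispatch: parse the arguments and run the matching function yourself
available = {"get_current_weather": get_current_weather}
args = json.loads(call_arguments)
result = available[call_name](**args)

# Append the result to the conversation so the model can answer in prose.
# With the tools API the role is "tool" and a tool_call_id is also required;
# with legacy function calling the role is "function".
followup = {"role": "function", "name": call_name, "content": json.dumps(result)}
print(followup["content"])
```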

Streaming & Async

# Streaming responses
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a long story"}],
    stream=True,
    max_tokens=1000
)

# Process stream
for chunk in response:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)

# Async usage (requires the async client, not the sync OpenAI client)
import asyncio
from openai import AsyncOpenAI

async_client = AsyncOpenAI()

async def get_completion():
    response = await async_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
        max_tokens=100
    )
    return response.choices[0].message.content

# Run async function
result = asyncio.run(get_completion())
When to Use Streaming

Chat applications for real-time typing effect

Long responses to show progress

Reducing perceived latency

Parameters & Tuning

Key Parameters

Temperature (0-2)

Low (0-0.3): Deterministic, consistent outputs

Medium (0.5-0.7): Balanced creativity and consistency

High (0.8-2.0): Creative, diverse outputs

Default: 1.0

Max Tokens

Caps the number of tokens generated in the response (not the prompt)

Considerations: Model context window, cost control

GPT-4 Turbo: 128,000-token context window, though output per request is capped at 4,096 tokens

Top P (0-1)

Nucleus sampling - considers tokens comprising top_p probability mass

Alternative to temperature sampling

Default: 1.0

Frequency & Presence Penalty

Frequency penalty: Penalizes tokens in proportion to how often they have already appeared, reducing verbatim repetition

Presence penalty: Penalizes any token that has appeared at all, encouraging new topics

Both range from -2.0 to 2.0 (default 0)

# Tuned parameter settings
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    temperature=0.7,
    max_tokens=500,
    top_p=0.9,
    frequency_penalty=0.5,
    presence_penalty=0.3
)

Prompt Engineering

System Message Best Practices

Define the assistant's role and personality

Set constraints and guidelines

Provide examples of desired behavior

# Effective system message
system_message = """You are an expert Python programmer and educator.
Your responses should be:
- Clear and concise
- Include code examples when relevant
- Explain complex concepts in simple terms
- Focus on best practices and readability"""

messages = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": "Explain list comprehensions in Python"}
]
Few-Shot Learning

Provide examples in the conversation to guide the model

Show the format you want the response in

Demonstrate the reasoning process

# Few-shot example
messages = [
    {"role": "system", "content": "Extract product information from user queries."},
    {"role": "user", "content": "I want to buy a red laptop under $1000"},
    {"role": "assistant", "content": '{"category": "laptop", "color": "red", "max_price": 1000}'},
    {"role": "user", "content": "Find me a blue smartphone with 5G"}
]
Tip: Chain-of-thought prompting (asking the model to think step by step) significantly improves reasoning tasks.
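Chain-of-thought prompting changes nothing but the wording of the request: you explicitly ask for intermediate reasoning before the answer. A minimal illustration (the word problem is made up):

```python
# Chain-of-thought prompting: only the user message changes.
problem = ("A store sells pens at $2 each and offers 3 pens for $5. "
           "What is the cheapest way to buy 7 pens?")

direct_prompt = problem
cot_prompt = problem + ("\n\nThink through this step by step, showing your "
                        "reasoning before giving the final answer.")

messages = [
    {"role": "system", "content": "You are a careful math tutor."},
    {"role": "user", "content": cot_prompt},
]
print(messages[1]["content"])
```

Sending `cot_prompt` instead of `direct_prompt` typically yields longer responses with visible intermediate steps and fewer arithmetic mistakes.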

Other OpenAI APIs

DALL-E Image Generation

# Generate images with DALL-E
response = client.images.generate(
    model="dall-e-3",
    prompt="A cute cat astronaut floating in space, digital art",
    size="1024x1024",
    quality="standard",
    n=1,
)

# Get image URL
image_url = response.data[0].url
print(f"Generated image: {image_url}")
DALL-E Parameters

model: dall-e-2 or dall-e-3

size: 256x256, 512x512, 1024x1024 (dall-e-2); 1024x1024, 1792x1024, 1024x1792 (dall-e-3)

quality: standard or hd (dall-e-3 only)

style: natural or vivid (dall-e-3 only)

Image Editing

Edit existing images with masks (images.edit)

Create variations of images (images.create_variation, dall-e-2 only)

# Create image variations
response = client.images.create_variation(
    image=open("original.png", "rb"),
    n=2,
    size="1024x1024"
)

Whisper & Embeddings

# Speech to text with Whisper
audio_file = open("speech.mp3", "rb")
transcription = client.audio.transcriptions.create(
    model="whisper-1",
    file=audio_file,
    response_format="text"
)
print(transcription)

# Text embeddings
response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="Your text string goes here"
)

# Get embedding vector
embedding = response.data[0].embedding
print(f"Embedding length: {len(embedding)}")
Embedding Use Cases

Semantic search: Find similar documents

Clustering: Group similar items

Recommendations: Suggest similar content

Anomaly detection: Identify outliers
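All four use cases reduce to comparing embedding vectors, usually by cosine similarity. A self-contained semantic-search sketch with toy 3-dimensional vectors standing in for real embeddings (text-embedding-ada-002 vectors have 1536 dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors; in practice these come from client.embeddings.create
query = [0.1, 0.9, 0.2]
docs = {
    "doc_a": [0.1, 0.8, 0.3],
    "doc_b": [0.9, 0.1, 0.0],
}

# Semantic search: rank documents by similarity to the query
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked)  # ['doc_a', 'doc_b'] -- doc_a points the same way as the query
```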

Whisper Features

Multilingual speech recognition

Translation to English

Timestamp generation

Prompt parameter to guide spelling and style (the API does not do speaker diarization)

Fine-tuning & Custom Models

Fine-tuning Process

When to Fine-tune

Specialized domain knowledge required

Consistent output format needed

Improved performance on specific tasks

Reduced latency for frequent queries

# Prepare training data
training_data = [
    {
        "messages": [
            {"role": "system", "content": "You are a helpful legal assistant."},
            {"role": "user", "content": "What is contract law?"},
            {"role": "assistant", "content": "Contract law governs..."}
        ]
    }
    # ... more examples
]

# Save as JSONL file
import json
with open("training_data.jsonl", "w") as f:
    for item in training_data:
        f.write(json.dumps(item) + "\n")
Fine-tuning Steps

1. Prepare training data in JSONL format

2. Upload file to OpenAI

3. Create fine-tuning job

4. Monitor training progress

5. Use fine-tuned model

Data Quality: Fine-tuning requires high-quality, consistent training data. OpenAI's guidance is that 50-100 well-crafted examples usually produce clear improvements (the hard minimum is 10).

Best Practices

Error Handling

Always implement proper error handling

Handle rate limits and quota exceeded

Implement retry logic with exponential backoff

# Robust API call with error handling
import time
from openai import OpenAIError, RateLimitError

def safe_completion(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4",
                messages=messages,
                max_tokens=500
            )
            return response.choices[0].message.content
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # Exponential backoff
        except OpenAIError as e:
            print(f"OpenAI API error: {e}")
            return None
Cost Optimization

Use appropriate model for the task

Set reasonable max_tokens limits

Cache frequent responses

Monitor usage with OpenAI dashboard
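Caching is the cheapest optimization of all: identical requests can be served from memory instead of being billed twice. A minimal in-memory sketch, keyed on the model plus a canonical serialization of the messages (`call_api` is a stand-in for the real client call):

```python
import json

_cache = {}

def cached_completion(model, messages, call_api):
    """Serve repeated (model, messages) requests from memory."""
    key = (model, json.dumps(messages, sort_keys=True))
    if key not in _cache:
        _cache[key] = call_api(model, messages)
    return _cache[key]

# Demonstration with a fake backend that counts how often it is really called
calls = {"n": 0}
def fake_api(model, messages):
    calls["n"] += 1
    return "answer"

msgs = [{"role": "user", "content": "Hi"}]
first = cached_completion("gpt-4", msgs, fake_api)
second = cached_completion("gpt-4", msgs, fake_api)
print(calls["n"])  # 1 -- the second request hit the cache
```

Note this only helps deterministic workloads; with high temperature, identical prompts are expected to produce different outputs, so caching changes behavior.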

Security & Privacy

Never send sensitive data to the API

Implement content moderation

Use data processing addendum for business use

Review OpenAI's data usage policies
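One way to implement content moderation is to gate every user input through the Moderation endpoint before it reaches a chat model. A sketch; the `is_allowed` helper is ours, while `flagged` is the boolean OpenAI's endpoint returns per result:

```python
from types import SimpleNamespace

def is_allowed(moderation_result):
    """moderation_result: one element of response.results from
    client.moderations.create(input=user_text)."""
    return not moderation_result.flagged

# Real usage (requires a configured client):
# response = client.moderations.create(input=user_text)
# if not is_allowed(response.results[0]):
#     raise ValueError("Input rejected by moderation")

# Stand-in result object, so the gate can be exercised without an API call
print(is_allowed(SimpleNamespace(flagged=False)))  # True
```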

Resources & Tools

Useful Resources

  • Official Documentation: platform.openai.com/docs
  • API Reference: platform.openai.com/docs/api-reference
  • Playground: platform.openai.com/playground
  • Cookbook: github.com/openai/openai-cookbook
  • Community Forum: community.openai.com
  • Pricing: openai.com/pricing
  • Status: status.openai.com

Development Tools

  • OpenAI Python Library: github.com/openai/openai-python
  • OpenAI Node.js Library: github.com/openai/openai-node
  • LangChain: Framework for LLM applications
  • LlamaIndex: Data framework for LLMs
  • Streamlit: For building AI web apps
  • Gradio: For creating AI demos
  • Weights & Biases: For experiment tracking