OpenAI & GPT Overview

OpenAI Platform

OpenAI API

Cloud-based platform providing access to state-of-the-art AI models including GPT, DALL-E, and Whisper.

Key Features

GPT Models: Text generation and understanding

DALL-E: Image generation from text

Whisper: Speech-to-text transcription

Embeddings: Text representation vectors

Moderation: Content safety filtering

Pricing Model

Pay-per-use based on tokens processed

Example: $0.03 per 1K input tokens (and $0.06 per 1K output tokens) for GPT-4 8K; see openai.com/pricing for current rates

New accounts receive limited free trial credits
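Because billing is per token, request cost is simple arithmetic over input and output token counts. A minimal estimator, using the illustrative GPT-4 rates quoted above (real prices change; check openai.com/pricing):

```python
# Rough cost estimator for token-based pricing.
# Rates are illustrative (GPT-4 8K launch pricing), not current prices.
PRICES_PER_1K = {
    "gpt-4": {"input": 0.03, "output": 0.06},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost of one request."""
    rates = PRICES_PER_1K[model]
    return (input_tokens / 1000) * rates["input"] + (output_tokens / 1000) * rates["output"]

print(estimate_cost("gpt-4", 1000, 500))  # 1K prompt tokens + 500 completion tokens -> 0.06
```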

Getting Started: Sign up at platform.openai.com and get your API key to start building.

GPT Model Evolution

GPT-3 (2020)

175 billion parameters; demonstrated few-shot learning at unprecedented scale

Capabilities: Text completion, translation, Q&A

GPT-3.5 (2022)

Improved version with better instruction following

Models: text-davinci-003, gpt-3.5-turbo

GPT-4 (2023)

Multimodal model with image understanding

Features: Better reasoning, safer outputs

Context: 8K or 32K tokens (128K arrived with GPT-4 Turbo)

GPT-4 Turbo (2023)

Enhanced version with a more recent knowledge cutoff (April 2023 at launch)

Improvements: Cheaper, larger context window

Current Recommendation: Use gpt-4-turbo for most applications due to cost-effectiveness and updated knowledge.

API Fundamentals

Basic API Setup

# Install the OpenAI Python package
pip install openai

# Legacy setup (openai < 1.0, deprecated)
import openai
openai.api_key = "sk-your-api-key-here"

# Current setup (openai >= 1.0)
from openai import OpenAI
client = OpenAI(api_key="sk-your-api-key-here")

# Environment variable (recommended)
import os
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# OpenAI() with no arguments also reads OPENAI_API_KEY automatically
API Key Security

Never commit API keys to version control

Use environment variables or secret management

Rotate keys regularly

Set usage limits in OpenAI dashboard

Best Practice: Store API keys in environment variables or use a secrets manager for production applications.

Chat Completions API

# Basic chat completion
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=500,
    temperature=0.7
)

# Extract response
answer = response.choices[0].message.content
print(answer)
Message Roles

system: Set assistant behavior and context

user: User inputs and questions

assistant: Previous assistant responses

tool: Function call results

Key Parameters

model: Which GPT model to use

messages: Conversation history

max_tokens: Maximum response length

temperature: Creativity (0-2)

Advanced API Features

Function Calling

# Define tools for the model to call
# (the older functions/function_call parameters still work but are deprecated)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

# Call the API with tool definitions
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=tools,
    tool_choice="auto"
)
Function Calling Benefits

Connect GPT to external APIs and databases

Execute code based on natural language requests

Build interactive applications with real-time data

Use Case: Weather apps, booking systems, data analysis tools that require real-time information.
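The model never executes anything itself: it returns the function name plus a JSON string of arguments, and your code must parse them, run the function, and send the result back. A sketch of that round trip, with a hardcoded sample standing in for the model's response (the weather stub is hypothetical):

```python
import json

# Your real implementation of the declared function (stubbed here)
def get_current_weather(location, unit="fahrenheit"):
    return {"location": location, "temperature": 22, "unit": unit}

# Whether it arrives via the legacy function_call field or the newer
# tool_calls field, the model hands back a name and a JSON arguments string.
call_name = "get_current_weather"
call_arguments = '{"location": "Boston, MA", "unit": "celsius"}'

# Dispatch: parse the arguments and run the matching function yourself
available = {"get_current_weather": get_current_weather}
args = json.loads(call_arguments)
result = available[call_name](**args)

# Append the result to the conversation so the model can answer in prose.
# With the tools API the role is "tool" and a tool_call_id is also required;
# with legacy function calling the role is "function".
followup = {"role": "function", "name": call_name, "content": json.dumps(result)}
print(followup["content"])
```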

Streaming & Async

# Streaming responses
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a long story"}],
    stream=True,
    max_tokens=1000
)

# Process stream
for chunk in response:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)

# Async usage (requires the async client, not the sync OpenAI client)
import asyncio
from openai import AsyncOpenAI

async_client = AsyncOpenAI()

async def get_completion():
    response = await async_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
        max_tokens=100
    )
    return response.choices[0].message.content

# Run async function
result = asyncio.run(get_completion())
When to Use Streaming

Chat applications for real-time typing effect

Long responses to show progress

Reducing perceived latency

Parameters & Tuning

Key Parameters

Temperature (0-2)

Low (0-0.3): Deterministic, consistent outputs

Medium (0.5-0.7): Balanced creativity and consistency

High (0.8-2.0): Creative, diverse outputs

Default: 1.0

Max Tokens

Caps the number of tokens generated in the response (not the prompt)

Considerations: Model context window, cost control

GPT-4 Turbo: 128,000-token context window, though output per request is capped at 4,096 tokens

Top P (0-1)

Nucleus sampling - considers tokens comprising top_p probability mass

Alternative to temperature sampling

Default: 1.0

Frequency & Presence Penalty

Frequency penalty: Penalizes tokens in proportion to how often they have already appeared, reducing verbatim repetition

Presence penalty: Penalizes any token that has appeared at all, encouraging new topics

Both range from -2.0 to 2.0 (default 0)

# Tuned parameter settings
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    temperature=0.7,
    max_tokens=500,
    top_p=0.9,
    frequency_penalty=0.5,
    presence_penalty=0.3
)

Prompt Engineering

System Message Best Practices

Define the assistant's role and personality

Set constraints and guidelines

Provide examples of desired behavior

# Effective system message
system_message = """You are an expert Python programmer and educator.
Your responses should be:
- Clear and concise
- Include code examples when relevant
- Explain complex concepts in simple terms
- Focus on best practices and readability"""

messages = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": "Explain list comprehensions in Python"}
]
Few-Shot Learning

Provide examples in the conversation to guide the model

Show the format you want the response in

Demonstrate the reasoning process

# Few-shot example
messages = [
    {"role": "system", "content": "Extract product information from user queries."},
    {"role": "user", "content": "I want to buy a red laptop under $1000"},
    {"role": "assistant", "content": '{"category": "laptop", "color": "red", "max_price": 1000}'},
    {"role": "user", "content": "Find me a blue smartphone with 5G"}
]
Tip: Chain-of-thought prompting (asking the model to think step by step) significantly improves reasoning tasks.
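Chain-of-thought prompting changes nothing but the wording of the request: you explicitly ask for intermediate reasoning before the answer. A minimal illustration (the word problem is made up):

```python
# Chain-of-thought prompting: only the user message changes.
problem = ("A store sells pens at $2 each and offers 3 pens for $5. "
           "What is the cheapest way to buy 7 pens?")

direct_prompt = problem
cot_prompt = problem + ("\n\nThink through this step by step, showing your "
                        "reasoning before giving the final answer.")

messages = [
    {"role": "system", "content": "You are a careful math tutor."},
    {"role": "user", "content": cot_prompt},
]
print(messages[1]["content"])
```

Sending `cot_prompt` instead of `direct_prompt` typically yields longer responses with visible intermediate steps and fewer arithmetic mistakes.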

Other OpenAI APIs

DALL-E Image Generation

# Generate images with DALL-E
response = client.images.generate(
    model="dall-e-3",
    prompt="A cute cat astronaut floating in space, digital art",
    size="1024x1024",
    quality="standard",
    n=1,
)

# Get image URL
image_url = response.data[0].url
print(f"Generated image: {image_url}")
DALL-E Parameters

model: dall-e-2 or dall-e-3

size: 256x256, 512x512, 1024x1024 (dall-e-2); 1024x1024, 1792x1024, 1024x1792 (dall-e-3)

quality: standard or hd (dall-e-3 only)

style: natural or vivid (dall-e-3 only)

Image Editing

Edit existing images with masks (images.edit)

Create variations of images (images.create_variation, dall-e-2 only)

# Create image variations
response = client.images.create_variation(
    image=open("original.png", "rb"),
    n=2,
    size="1024x1024"
)

Whisper & Embeddings

# Speech to text with Whisper
audio_file = open("speech.mp3", "rb")
transcription = client.audio.transcriptions.create(
    model="whisper-1",
    file=audio_file,
    response_format="text"
)
print(transcription)

# Text embeddings
response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="Your text string goes here"
)

# Get embedding vector
embedding = response.data[0].embedding
print(f"Embedding length: {len(embedding)}")
Embedding Use Cases

Semantic search: Find similar documents

Clustering: Group similar items

Recommendations: Suggest similar content

Anomaly detection: Identify outliers
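All four use cases reduce to comparing embedding vectors, usually by cosine similarity. A self-contained semantic-search sketch with toy 3-dimensional vectors standing in for real embeddings (text-embedding-ada-002 vectors have 1536 dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors; in practice these come from client.embeddings.create
query = [0.1, 0.9, 0.2]
docs = {
    "doc_a": [0.1, 0.8, 0.3],
    "doc_b": [0.9, 0.1, 0.0],
}

# Semantic search: rank documents by similarity to the query
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked)  # ['doc_a', 'doc_b'] -- doc_a points the same way as the query
```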

Whisper Features

Multilingual speech recognition

Translation to English

Timestamp generation

Prompt parameter to guide spelling and style (the API does not do speaker diarization)

Fine-tuning & Custom Models

Fine-tuning Process

When to Fine-tune

Specialized domain knowledge required

Consistent output format needed

Improved performance on specific tasks

Reduced latency for frequent queries

# Prepare training data
training_data = [
    {
        "messages": [
            {"role": "system", "content": "You are a helpful legal assistant."},
            {"role": "user", "content": "What is contract law?"},
            {"role": "assistant", "content": "Contract law governs..."}
        ]
    }
    # ... more examples
]

# Save as JSONL file
import json
with open("training_data.jsonl", "w") as f:
    for item in training_data:
        f.write(json.dumps(item) + "\n")
Fine-tuning Steps

1. Prepare training data in JSONL format

2. Upload file to OpenAI

3. Create fine-tuning job

4. Monitor training progress

5. Use fine-tuned model

Data Quality: Fine-tuning requires high-quality, consistent training data. OpenAI's guidance is that 50-100 well-crafted examples usually produce clear improvements (the hard minimum is 10).

Best Practices

Error Handling

Always implement proper error handling

Handle rate limits and quota exceeded

Implement retry logic with exponential backoff

# Robust API call with error handling
import time
from openai import OpenAIError, RateLimitError

def safe_completion(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4",
                messages=messages,
                max_tokens=500
            )
            return response.choices[0].message.content
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # Exponential backoff
        except OpenAIError as e:
            print(f"OpenAI API error: {e}")
            return None
Cost Optimization

Use appropriate model for the task

Set reasonable max_tokens limits

Cache frequent responses

Monitor usage with OpenAI dashboard
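Caching is the cheapest optimization of all: identical requests can be served from memory instead of being billed twice. A minimal in-memory sketch, keyed on the model plus a canonical serialization of the messages (`call_api` is a stand-in for the real client call):

```python
import json

_cache = {}

def cached_completion(model, messages, call_api):
    """Serve repeated (model, messages) requests from memory."""
    key = (model, json.dumps(messages, sort_keys=True))
    if key not in _cache:
        _cache[key] = call_api(model, messages)
    return _cache[key]

# Demonstration with a fake backend that counts how often it is really called
calls = {"n": 0}
def fake_api(model, messages):
    calls["n"] += 1
    return "answer"

msgs = [{"role": "user", "content": "Hi"}]
first = cached_completion("gpt-4", msgs, fake_api)
second = cached_completion("gpt-4", msgs, fake_api)
print(calls["n"])  # 1 -- the second request hit the cache
```

Note this only helps deterministic workloads; with high temperature, identical prompts are expected to produce different outputs, so caching changes behavior.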

Security & Privacy

Never send sensitive data to the API

Implement content moderation

Use data processing addendum for business use

Review OpenAI's data usage policies
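One way to implement content moderation is to gate every user input through the Moderation endpoint before it reaches a chat model. A sketch; the `is_allowed` helper is ours, while `flagged` is the boolean OpenAI's endpoint returns per result:

```python
from types import SimpleNamespace

def is_allowed(moderation_result):
    """moderation_result: one element of response.results from
    client.moderations.create(input=user_text)."""
    return not moderation_result.flagged

# Real usage (requires a configured client):
# response = client.moderations.create(input=user_text)
# if not is_allowed(response.results[0]):
#     raise ValueError("Input rejected by moderation")

# Stand-in result object, so the gate can be exercised without an API call
print(is_allowed(SimpleNamespace(flagged=False)))  # True
```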

Resources & Tools

Useful Resources

  • Official Documentation: platform.openai.com/docs
  • API Reference: platform.openai.com/docs/api-reference
  • Playground: platform.openai.com/playground
  • Cookbook: github.com/openai/openai-cookbook
  • Community Forum: community.openai.com
  • Pricing: openai.com/pricing
  • Status: status.openai.com

Development Tools

  • OpenAI Python Library: github.com/openai/openai-python
  • OpenAI Node.js Library: github.com/openai/openai-node
  • LangChain: Framework for LLM applications
  • LlamaIndex: Data framework for LLMs
  • Streamlit: For building AI web apps
  • Gradio: For creating AI demos
  • Weights & Biases: For experiment tracking