Creating Your Own Images (Dockerfile)

Now we'll learn how to create your own custom Docker images using Dockerfiles. This is where Docker becomes really powerful!


Part 1: What is a Dockerfile?

Simple Definition

Dockerfile = A text file containing instructions to build a Docker image.

Think of it as:

Recipe Card (Dockerfile):
├── List ingredients (base image)
├── Preparation steps (RUN commands)
├── Add your items (COPY files)
├── Cooking instructions (CMD)
└── Final dish (your custom image)

Following recipe → Creates the dish
Reading Dockerfile → Builds the image

Another Analogy:

Construction Blueprint (Dockerfile):
├── Foundation type (FROM)
├── Building materials (RUN install packages)
├── Interior design (COPY your files)
├── Final touches (CMD)
└── Complete building (custom image)

Why Create Custom Images?

Rather than running stock images as-is, you create custom ones to:

1. Package your own application
   └── Your code + environment together

2. Customize existing images
   └── Add tools/packages you need

3. Create reproducible environments
   └── Same setup everywhere

4. Share with team
   └── Everyone uses same environment

5. Deploy applications
   └── Production-ready packages

Dockerfile Basics

A Dockerfile is just a text file named Dockerfile (no extension).

Simple example:

FROM ubuntu:22.04
RUN apt-get update
RUN apt-get install -y python3
COPY app.py /app/
CMD ["python3", "/app/app.py"]

What this does:

Line 1: Start with Ubuntu 22.04 as base
Line 2: Update package lists
Line 3: Install Python 3
Line 4: Copy your app.py file into image
Line 5: Run your app when container starts

Part 2: Dockerfile Instructions - FROM

FROM - The Base Image

FROM = Starting point for your image

Syntax:

FROM image:tag

Examples:

# Start with Ubuntu
FROM ubuntu:22.04

# Start with Python already installed
FROM python:3.11

# Start with Node.js
FROM node:18

# Start with minimal Alpine Linux
FROM alpine:3.18

# Start from scratch (empty image)
FROM scratch

Understanding FROM

Every Dockerfile MUST start with FROM:

FROM ubuntu:22.04
# ↑
# This is always the first instruction
# (except for ARG, which we'll learn later)

Why use different base images?

Choose based on needs:

Need Python?
FROM python:3.11
└── Python already installed ✓

Need Node.js?
FROM node:18
└── Node.js already installed ✓

Want minimal size?
FROM alpine:3.18
└── Smallest base (5MB) ✓

Want full Ubuntu?
FROM ubuntu:22.04
└── More packages available ✓

Starting from scratch?
FROM scratch
└── Build everything yourself

Example: Different Base Images

Example 1: Start with Ubuntu, install Python

FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3
# Manual installation

Example 2: Start with Python already

FROM python:3.11
# Python already included! ✓

Which is better?

Option 2 (FROM python:3.11) because:
├── Python pre-configured correctly
├── Includes pip and other tools
├── Follows best practices
└── Less code to write

Unless you need specific Ubuntu features,
use official language images! ✓

Part 3: Dockerfile Instructions - RUN

RUN - Execute Commands During Build

RUN = Run commands when building the image

Syntax:

RUN command

Examples:

# Install packages (Ubuntu/Debian)
RUN apt-get update && apt-get install -y curl

# Install packages (Alpine)
RUN apk add --no-cache curl

# Install Python packages
RUN pip install flask

# Create directories
RUN mkdir -p /app/data

# Download files
RUN curl -O https://example.com/file.zip

RUN - Important Concepts

Each RUN creates a new layer:

FROM ubuntu:22.04
RUN apt-get update           # Layer 1
RUN apt-get install -y curl  # Layer 2
RUN apt-get install -y vim   # Layer 3

Better approach - combine commands:

FROM ubuntu:22.04
RUN apt-get update && \
    apt-get install -y curl vim
# Single layer ✓

Why combine?

Separate RUN commands:
├── More layers to store and transfer
├── A cached `apt-get update` layer can go stale, so a
│   later `install` may fetch outdated package lists
└── Files deleted in a later layer still take up space
    in the earlier layers

Combined RUN:
├── One layer, less overhead
├── update + install always run together ✓
└── Cleanup in the same layer actually shrinks the image ✓

RUN Examples for Different Languages

Python:

FROM python:3.11
RUN pip install flask sqlalchemy requests

Node.js:

FROM node:18
RUN npm install express mongoose

System packages:

FROM ubuntu:22.04
RUN apt-get update && apt-get install -y \
    git \
    curl \
    vim \
    && rm -rf /var/lib/apt/lists/*
# ↑ Cleanup to reduce size

Part 4: Dockerfile Instructions - COPY and ADD

COPY - Copy Files from Host to Image

COPY = Copy files from your computer into the image

Syntax:

COPY source destination

Examples:

# Copy single file
COPY app.py /app/

# Copy all files in current directory
COPY . /app/

# Copy multiple files
COPY app.py config.json /app/

# Copy directory
COPY ./src /app/src/

Understanding COPY

Visual:

Your Computer:              Docker Image:
┌─────────────────┐        ┌──────────────┐
│ my-project/     │        │              │
│ ├── app.py      │  COPY  │ /app/        │
│ ├── config.json │───────→│ ├── app.py   │
│ └── data/       │        │ └── config   │
└─────────────────┘        └──────────────┘

Example Dockerfile:

FROM python:3.11

# Create app directory in image
RUN mkdir /app

# Copy your files into image
COPY app.py /app/
COPY requirements.txt /app/

# Copy everything
COPY . /app/

ADD vs COPY

ADD = Like COPY but with extra features

# COPY: Just copies files
COPY app.py /app/

# ADD: Copies AND extracts local tar archives
ADD archive.tar.gz /app/
# ↑ Automatically extracted!

# ADD: Can download from a URL (URL downloads are NOT extracted)
ADD https://example.com/file.txt /app/

Which to use?

Use COPY (recommended):
├── Simpler
├── More predictable
└── Best practice ✓

Use ADD only when:
├── Need to extract archives
└── Need to download from URL

Docker best practice: Prefer COPY

Part 5: Dockerfile Instructions - WORKDIR

WORKDIR - Set Working Directory

WORKDIR = Set the current directory inside the image

Syntax:

WORKDIR /path/to/directory

Without WORKDIR:

FROM ubuntu:22.04
RUN mkdir /app
COPY app.py /app/
CMD ["python3", "/app/app.py"]
# ↑ Must keep spelling out /app everywhere

With WORKDIR:

FROM ubuntu:22.04
WORKDIR /app
# ↑ Set once (also creates /app)

COPY app.py .
# . means current directory (/app)

CMD ["python3", "app.py"]
# Runs from /app

WORKDIR Benefits

Makes Dockerfile cleaner:

FROM python:3.11

# Set working directory
WORKDIR /app

# Now all commands run in /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

CMD ["python", "app.py"]
# Runs from /app directory automatically

WORKDIR creates directory if it doesn't exist:

FROM ubuntu:22.04
WORKDIR /app/data/logs
# ↑ Creates entire path automatically ✓

Part 6: Dockerfile Instructions - CMD and ENTRYPOINT

CMD - Default Command

CMD = Command to run when container starts

Syntax:

CMD ["executable", "param1", "param2"]

Examples:

# Run Python script
CMD ["python3", "app.py"]

# Start web server
CMD ["nginx", "-g", "daemon off;"]

# Run shell command
CMD ["echo", "Hello Docker!"]

Understanding CMD

CMD sets the default command:

FROM ubuntu:22.04
WORKDIR /app
COPY hello.py .
CMD ["python3", "hello.py"]

When you run container:

docker run myimage
# Automatically runs: python3 hello.py

Can be overridden:

docker run myimage echo "Different command"
# Runs: echo "Different command"
# (CMD is ignored)

CMD Formats

Three formats:

1. Exec form (recommended):

CMD ["python3", "app.py"]
# ↑ As JSON array

2. Shell form:

CMD python3 app.py
# ↑ As shell command

3. As parameters to ENTRYPOINT:

ENTRYPOINT ["python3"]
CMD ["app.py"]
# Together: python3 app.py
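The practical difference between the two forms is the argv Docker actually runs: the exec form is used verbatim, while the shell form is wrapped in `/bin/sh -c`, so the shell (not your program) becomes PID 1 and receives signals. A small sketch of that wrapping (the helper name `resolve_cmd` is made up for illustration):

```python
def resolve_cmd(cmd):
    """Illustrative: turn a Dockerfile CMD value into the argv Docker runs.

    Exec form (a JSON array) is used as-is; shell form (a plain string)
    is wrapped in `/bin/sh -c`, so the shell becomes PID 1.
    """
    if isinstance(cmd, list):          # exec form: CMD ["python3", "app.py"]
        return cmd
    return ["/bin/sh", "-c", cmd]      # shell form: CMD python3 app.py

print(resolve_cmd(["python3", "app.py"]))   # exec form, used verbatim
print(resolve_cmd("python3 app.py"))        # shell form, wrapped in a shell
```

This is why the exec form is recommended: your process gets SIGTERM directly on `docker stop` instead of the shell swallowing it.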

ENTRYPOINT - Fixed Command

ENTRYPOINT = Command that always runs (cannot be easily overridden)

Difference between CMD and ENTRYPOINT:

# Using CMD
CMD ["python3", "app.py"]

docker run myimage
# Runs: python3 app.py

docker run myimage ls
# Runs: ls (CMD ignored!)

# Using ENTRYPOINT
ENTRYPOINT ["python3", "app.py"]

docker run myimage
# Runs: python3 app.py

docker run myimage ls
# Runs: python3 app.py ls
#       ↑ Still runs ENTRYPOINT!

CMD + ENTRYPOINT Together

Powerful combination:

FROM python:3.11
WORKDIR /app
COPY app.py .

ENTRYPOINT ["python3"]
CMD ["app.py"]

Usage:

# Run default
docker run myimage
# Executes: python3 app.py

# Run different script
docker run myimage test.py
# Executes: python3 test.py
# ↑ ENTRYPOINT fixed, CMD replaced
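The rule above can be modeled as simple list concatenation: the container runs ENTRYPOINT followed by either the arguments you passed to `docker run`, or CMD if you passed none. A sketch of that rule (the function name is made up for illustration):

```python
def final_command(entrypoint, cmd, run_args=None):
    """ENTRYPOINT is fixed; `docker run` arguments replace CMD, not ENTRYPOINT."""
    args = run_args if run_args else cmd
    return (entrypoint or []) + (args or [])

# docker run myimage          -> python3 app.py
print(final_command(["python3"], ["app.py"]))
# docker run myimage test.py  -> python3 test.py
print(final_command(["python3"], ["app.py"], ["test.py"]))
```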

Part 7: Dockerfile Instructions - ENV

ENV - Environment Variables

ENV = Set environment variables in the image

Syntax:

ENV KEY=VALUE

Examples:

# Set single variable
ENV APP_ENV=production

# Set multiple variables
ENV APP_ENV=production \
    DB_HOST=localhost \
    DB_PORT=5432

Using ENV

Example Dockerfile:

FROM python:3.11
WORKDIR /app

# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV APP_PORT=8000
ENV DEBUG=False

COPY app.py .
CMD ["python", "app.py"]

In your Python code (app.py):

import os

port = os.getenv('APP_PORT')   # Gets the string '8000'
debug = os.getenv('DEBUG')     # Gets the string 'False' (not a boolean!)
print(f"Starting app on port {port}")

Override ENV at Runtime

You can override when running container:

docker run -e APP_PORT=9000 myimage
# APP_PORT is now 9000 (overrides Dockerfile ENV)
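Because environment variables are always strings, values like '8000' and 'False' need explicit conversion before use. A small helper sketch with sensible defaults (the name `read_config` is made up for illustration):

```python
import os

def read_config(env):
    """Environment variables are always strings; convert them explicitly."""
    port = int(env.get('APP_PORT', '8000'))                # default if unset
    debug = env.get('DEBUG', 'False').lower() == 'true'    # 'False' -> False
    return port, debug

print(read_config(os.environ))                       # real environment
print(read_config({'APP_PORT': '9000', 'DEBUG': 'True'}))
```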

Part 8: Dockerfile Instructions - EXPOSE

EXPOSE - Document Which Ports are Used

EXPOSE = Tells Docker which ports the container listens on

Syntax:

EXPOSE port

Examples:

# Web server on port 80
EXPOSE 80

# Multiple ports
EXPOSE 80 443

# With protocol
EXPOSE 8080/tcp
EXPOSE 53/udp

Understanding EXPOSE

IMPORTANT: EXPOSE is just documentation!

EXPOSE does NOT:
├── Actually publish the port
├── Make port accessible from outside
└── Do port mapping

EXPOSE only:
├── Documents which ports the app uses
└── Helps other developers understand

You still need -p when running:

docker run -p 8080:80 myimage
# ↑ This is what actually publishes the port

Example:

FROM nginx:latest
EXPOSE 80
# ↑ Documents: "nginx uses port 80"

Part 9: Creating Your First Dockerfile

Example 1: Simple Python Application

Let's create a real Dockerfile!

Step 1: Create project directory

mkdir my-python-app
cd my-python-app

Step 2: Create a simple Python app (app.py)

# app.py
print("Hello from Docker!")
print("This is my first containerized app!")

import time
while True:
    print("App is running...")
    time.sleep(5)

Step 3: Create Dockerfile

Create a file named Dockerfile (no extension):

# Use Python 3.11 as base image
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Copy Python script into container
COPY app.py .

# Run the application
CMD ["python", "app.py"]

Step 4: Build the image

docker build -t my-python-app .

Understanding the command:

docker build
    -t my-python-app    ← Tag (name) for the image
    .                   ← Build context (current directory)

What you'll see:

[+] Building 12.3s (8/8) FINISHED
 => [1/3] FROM python:3.11-slim
 => [2/3] WORKDIR /app
 => [3/3] COPY app.py .
 => exporting to image
 => => naming to my-python-app

Successfully built!

Step 5: Run your container

docker run my-python-app

Output:

Hello from Docker!
This is my first containerized app!
App is running...
App is running...
App is running...
...

🎉 Congratulations! You built your first Docker image!


Understanding the Build Process

What happened during docker build:

Step 1: Read Dockerfile
Step 2: FROM python:3.11-slim
        └── Download base image (if not present)

Step 3: WORKDIR /app
        └── Create /app directory in image

Step 4: COPY app.py .
        └── Copy your file into image

Step 5: CMD ["python", "app.py"]
        └── Set default command

Step 6: Create final image
        └── Tag it as "my-python-app"

Example 2: Python Web App with Dependencies

Let's create something more realistic!

Step 1: Create project structure

mkdir flask-app
cd flask-app

Step 2: Create app.py

# app.py
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    return '<h1>Hello from Dockerized Flask!</h1>'

@app.route('/about')
def about():
    return '<h1>This is a Flask app running in Docker</h1>'

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Step 3: Create requirements.txt

Flask==3.0.0

Step 4: Create Dockerfile

# Use Python 3.11 slim image
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Copy requirements file
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY app.py .

# Expose port 5000
EXPOSE 5000

# Run the application
CMD ["python", "app.py"]

Step 5: Build the image

docker build -t flask-app .

Step 6: Run the container

docker run -d -p 5000:5000 --name my-flask-app flask-app

Step 7: Test it

Open browser and go to:

  • http://localhost:5000
  • http://localhost:5000/about

You should see your Flask app running! 🎉


Understanding This Dockerfile

Let's break it down:

FROM python:3.11-slim
# Start with lightweight Python image

WORKDIR /app
# All subsequent commands run in /app

COPY requirements.txt .
# Copy dependencies list first
# Why first? For layer caching! (explained next)

RUN pip install --no-cache-dir -r requirements.txt
# Install Python packages
# --no-cache-dir = Don't save pip cache (smaller image)

COPY app.py .
# Copy application code

EXPOSE 5000
# Document that Flask uses port 5000

CMD ["python", "app.py"]
# Start Flask when container runs

Build Context and .dockerignore

Build Context = Files Docker can access during build

When you run:

docker build -t myapp .
#                      ↑
#                 Build context (current directory)

Docker sends all files in this directory to the Docker daemon:

my-project/
├── app.py              ← Sent to Docker
├── requirements.txt    ← Sent to Docker
├── data.csv            ← Sent to Docker
├── old_backup.zip      ← Sent to Docker (unnecessary!)
└── node_modules/       ← Sent to Docker (huge, unnecessary!)

Using .dockerignore

Create .dockerignore file to exclude files:

# .dockerignore

# Python
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
env/
venv/

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Git
.git/
.gitignore

# Documentation
*.md
docs/

# Tests
tests/
*.test.py

# Data files (if not needed in image)
*.csv
*.xlsx
data/

Now build is faster:

Without .dockerignore:
└── Sends 500MB to Docker

With .dockerignore:
└── Sends 5MB to Docker ✓

Build time: Much faster! ✓
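Conceptually, the daemon walks the build context and skips anything matching a `.dockerignore` pattern. The real matching follows Go's filepath rules (with extras like `**` and `!` exceptions); Python's `fnmatch` below is only a rough stand-in to show the filtering idea:

```python
from fnmatch import fnmatch

IGNORE = ['__pycache__/*', '*.pyc', '.git/*', 'venv/*', '*.csv']

def is_ignored(path):
    """Rough sketch of .dockerignore filtering (real rules are richer)."""
    return any(fnmatch(path, pat) for pat in IGNORE)

files = ['app.py', 'requirements.txt', 'data.csv', 'venv/lib/flask.py']
context = [f for f in files if not is_ignored(f)]
print(context)   # only app.py and requirements.txt reach the daemon
```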

Part 10: Building Images - Best Practices

Best Practice 1: Layer Caching

Docker caches layers to speed up builds!

Bad example (slow rebuilds):

FROM python:3.11-slim
WORKDIR /app

# Copy everything
COPY . .

# Install dependencies
RUN pip install -r requirements.txt

CMD ["python", "app.py"]

Problem:

Change app.py:
        ↓
COPY . . changes (includes app.py)
        ↓
Cache invalidated
        ↓
Must re-run pip install (slow!) ✗

Good example (fast rebuilds):

FROM python:3.11-slim
WORKDIR /app

# Copy only requirements first
COPY requirements.txt .

# Install dependencies
RUN pip install -r requirements.txt

# Copy application code
COPY app.py .

CMD ["python", "app.py"]

Why better:

Change app.py:
        ↓
COPY requirements.txt . (unchanged)
        ↓
RUN pip install (cached! ✓)
        ↓
COPY app.py . (only this layer rebuilds)
        ↓
Fast rebuild! ✓

Rule: Copy files that change less frequently first!
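A useful mental model: each layer's cache key depends on the previous layer's key, the instruction text, and (for COPY) the copied files' contents. Change any input and that layer plus everything after it rebuilds. A toy version of the idea (this is not Docker's real algorithm, just the dependency chain):

```python
import hashlib

def cache_key(parent, instruction, content=''):
    """Toy model: a layer's cache key covers its parent, its instruction,
    and any copied file content."""
    data = parent + instruction + content
    return hashlib.sha256(data.encode()).hexdigest()[:12]

base = cache_key('', 'FROM python:3.11-slim')
reqs = cache_key(base, 'COPY requirements.txt .', 'Flask==3.0.0')
pip  = cache_key(reqs, 'RUN pip install -r requirements.txt')
app  = cache_key(pip,  'COPY app.py .', 'print("v1")')

# Editing app.py changes only the last key; base/reqs/pip stay cached
app2 = cache_key(pip, 'COPY app.py .', 'print("v2")')
print(app != app2)
```

Copying `requirements.txt` before the rest of the code is exactly what keeps the expensive `pip` key stable between edits.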


Best Practice 2: Minimize Layers

Combine RUN commands:

Bad:

RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y vim
RUN apt-get install -y git
# 4 layers ✗

Good:

RUN apt-get update && apt-get install -y \
    curl \
    vim \
    git
# 1 layer ✓

Best Practice 3: Clean Up in Same Layer

Bad:

RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
# Cleanup in separate layer doesn't reduce image size! ✗

Good:

RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*
# Cleanup in same layer reduces size! ✓
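Why the separate-layer cleanup fails: an image's size is the sum of its layers, and a layer only records what it adds. Deleting a file in a later layer just adds a "whiteout" marker; it can't shrink the earlier layer. A toy model of that arithmetic (sizes are made-up illustrative numbers):

```python
def image_size(layers):
    """Toy model: image size = sum of layer sizes.

    Deleting files in a later layer adds a whiteout marker;
    it never shrinks an earlier layer.
    """
    return sum(size for _, size in layers)

separate = [('apt-get update', 40), ('install curl', 10), ('rm apt lists', 0)]
combined = [('update + install curl + rm apt lists', 10)]

print(image_size(separate))   # the deleted lists still occupy layer 1
print(image_size(combined))   # cleanup happened before the layer was saved
```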

Best Practice 4: Use Specific Tags

Bad:

FROM python:latest
# What version? Changes over time! ✗

Good:

FROM python:3.11-slim
# Specific version, predictable! ✓

Best Practice 5: Use .dockerignore

Always create .dockerignore to exclude unnecessary files!


Best Practice 6: Multi-line for Readability

Bad:

RUN apt-get update && apt-get install -y curl vim git wget htop

Good:

RUN apt-get update && apt-get install -y \
    curl \
    vim \
    git \
    wget \
    htop
# Easier to read and modify ✓

Practice Exercises

Let's practice creating Dockerfiles!

Exercise 1: Node.js Application

Create a simple Node.js app:

app.js:

const http = require('http');

const server = http.createServer((req, res) => {
    res.writeHead(200, {'Content-Type': 'text/html'});
    res.end('<h1>Hello from Node.js in Docker!</h1>');
});

server.listen(3000, '0.0.0.0', () => {
    console.log('Server running on port 3000');
});

Your task: Create Dockerfile

Hints:

  • Use FROM node:18
  • WORKDIR /app
  • COPY app.js .
  • EXPOSE 3000
  • CMD ["node", "app.js"]

Build and run:

docker build -t node-app .
docker run -d -p 3000:3000 node-app

Test: http://localhost:3000


Exercise 2: Static Website with Nginx

Create index.html:

<!DOCTYPE html>
<html>
<head>
    <title>My Docker Site</title>
</head>
<body>
    <h1>Welcome to my Dockerized website!</h1>
    <p>This is served by Nginx running in a Docker container.</p>
</body>
</html>

Your task: Create Dockerfile

Hints:

  • Use FROM nginx:alpine
  • Copy index.html to /usr/share/nginx/html/
  • EXPOSE 80

Build and run:

docker build -t my-website .
docker run -d -p 8080:80 my-website

Test: http://localhost:8080


Summary

What We Learned:

✅ What a Dockerfile is
✅ Dockerfile instructions:
   ├── FROM (base image)
   ├── RUN (execute commands)
   ├── COPY (copy files)
   ├── WORKDIR (set directory)
   ├── CMD (default command)
   ├── ENTRYPOINT (fixed command)
   ├── ENV (environment variables)
   └── EXPOSE (document ports)
✅ Building images with docker build
✅ Layer caching
✅ Best practices
✅ .dockerignore
✅ Created real applications
