Chapter 17: Error Handling
A pipeline that crashes with a clear error message is better than one that runs silently and produces wrong answers. The crash tells you exactly what went wrong and where. Silent bad data ships to a dashboard, and someone discovers the problem three weeks later in a board meeting.
Python stops a program the moment something goes wrong and prints a traceback — a detailed report of what happened and where. Your job as an engineer is not to prevent every error from occurring. It is to decide which errors to catch, which to let through, and which to raise deliberately when your code encounters something it should not accept.
17.1 What Are Exceptions?
Python errors fall into three categories:
| Type | When it appears | Example |
|---|---|---|
| Syntax error | Before the script runs | Missing colon, unclosed bracket |
| Runtime exception | While the script runs | Dividing by zero, opening a missing file |
| Logic error | Never: the script runs fine, but the results are wrong | Using > instead of >= in a condition |
Syntax errors Python catches before running a single line. Runtime exceptions Python raises mid-execution when something impossible is attempted. Logic errors Python never catches — they are yours to find.
This chapter covers runtime exceptions, which Python calls exceptions or errors interchangeably.
Reading a Traceback
records = [100, 200, 300]
print(records[5])
Traceback (most recent call last):
  File "script.py", line 2, in <module>
    print(records[5])
IndexError: list index out of range
Read a traceback from bottom to top:
- Bottom line — the exception type and message. This is what went wrong.
- Middle lines — the call stack. Where in the code it happened.
- Top line — always the same. Start from the bottom.
IndexError: list index out of range tells you everything — a list was accessed at a position that does not exist.
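Tracebacks from real scripts usually span several function calls. The sketch below uses two hypothetical functions (get_record and build_report) and the standard traceback module to capture a traceback as text, so the bottom-up reading order can be checked in code.

```python
import traceback

def get_record(records, position):
    # Fails when position is past the end of the list.
    return records[position]

def build_report(records):
    # Calls get_record, so the traceback contains more than one frame.
    return get_record(records, 5)

try:
    build_report([100, 200, 300])
except IndexError:
    # format_exc() returns the same text Python would print on a crash.
    report = traceback.format_exc()

lines = report.strip().splitlines()
print(lines[0])    # top line: the fixed header
print(lines[-1])   # bottom line: exception type and message
```

The lines in between list the calls from outermost to innermost, so the frame closest to the bottom is the one where the error was actually raised.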
17.2 try and except
Wrap code that might fail in a try block. If an exception is raised, Python jumps to the except block instead of crashing.
try:
    result = 100 / 0
except:
    print("Something went wrong.")
Something went wrong.
The script did not crash. But this is the weakest form of error handling — a bare except catches everything, including errors you did not expect and should not be hiding. More on this in section 17.3.
try / except Flow
try block runs
│
├── no exception ──────────────────▶ continues normally
│
└── exception raised ──▶ except block runs ──▶ continues after
value = "forty-two"
try:
    number = int(value)
    print(f"Converted: {number}")
except ValueError:
    print(f"Could not convert '{value}' to an integer.")
Could not convert 'forty-two' to an integer.
The script does not crash. The error is caught, a clear message is printed, and execution continues.
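To see what "execution continues" means across more than one value, here is a small sketch that converts a list of strings and keeps going past the bad one. The list contents and variable names are illustrative assumptions.

```python
raw_values = ["10", "forty-two", "30"]   # illustrative input
converted = []

for value in raw_values:
    try:
        converted.append(int(value))
    except ValueError:
        # Only the bad value is reported; the loop moves on to the next one.
        print(f"Could not convert '{value}' to an integer.")

print(converted)
```

The handler sits inside the loop, so one failed conversion affects only that iteration.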
Try It 17.1 — Write a try/except block that attempts to open a file called missing.txt for reading. Catch the FileNotFoundError and print a message explaining the file was not found. Confirm the script continues after the except block by printing "script complete" afterward.
17.3 Catching Specific Exceptions
Python has dozens of built-in exception types. Catching the right one makes your error handling precise and your code trustworthy.
Common Built-in Exceptions
| Exception | When it occurs |
|---|---|
| ValueError | Right type, wrong value — int("hello") |
| TypeError | Wrong type entirely — "text" + 5 |
| KeyError | Dictionary key does not exist — d["missing"] |
| IndexError | List index out of range — lst[99] |
| FileNotFoundError | File does not exist |
| ZeroDivisionError | Division by zero |
| AttributeError | Method or attribute does not exist on an object |
| ImportError | Module not found |
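The rows in the table can be checked directly. This sketch triggers six of the listed exceptions on purpose and records the type Python actually raises. The dictionary of lambdas is just a convenient way to run each one. FileNotFoundError and ImportError are left out because they depend on the environment.

```python
# Each lambda deliberately triggers one exception from the table.
triggers = {
    "ValueError": lambda: int("hello"),
    "TypeError": lambda: "text" + 5,
    "KeyError": lambda: {"a": 1}["missing"],
    "IndexError": lambda: [1, 2, 3][99],
    "ZeroDivisionError": lambda: 1 / 0,
    "AttributeError": lambda: "text".no_such_method(),
}

observed = {}
for expected_name, trigger in triggers.items():
    try:
        trigger()
    except Exception as e:   # broad on purpose: this loop only inspects the type
        observed[expected_name] = type(e).__name__
        print(f"{type(e).__name__}: {e}")
```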
Catching Multiple Exceptions
def convert_record(value):
    try:
        return int(value)
    except ValueError:
        print(f"Bad value: '{value}' is not a number.")
        return None
    except TypeError:
        print(f"Bad type: expected a string, got {type(value).__name__}.")
        return None
print(convert_record("250"))
print(convert_record("abc"))
print(convert_record(None))
250
Bad value: 'abc' is not a number.
None
Bad type: expected a string, got NoneType.
None
Each exception type gets its own except block with its own response. A ValueError and a TypeError deserve different messages — and different fixes.
You can also catch multiple exceptions in one block when the response is the same:
value, count = "250", 0  # example inputs; count = 0 triggers ZeroDivisionError
try:
    result = int(value) / count
except (ValueError, ZeroDivisionError) as e:
    print(f"Could not calculate result: {e}")
The as e captures the exception object, giving you access to its message.
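A short sketch of what the captured object holds. One detail worth knowing: Python clears the name bound by as when the except block ends, so anything needed later has to be copied into another variable first. The variable names here are illustrative.

```python
try:
    int("abc")
except ValueError as e:
    error_type = type(e).__name__   # the exception class name
    error_text = str(e)             # the human-readable message
    error_args = e.args             # tuple of arguments passed to the exception

# 'e' no longer exists here, but the copies do.
print(error_type)
print(error_text)
print(error_args)
```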
⚠ Common Mistake — Bare except Hides Real Bugs
try:
    process_file("data.csv")
except:
    print("Something failed.")
A bare except catches absolutely everything — including KeyboardInterrupt (Ctrl+C), MemoryError, and bugs you introduced in process_file() itself. You will never know what actually went wrong. Always name the exception you expect:
except FileNotFoundError:
    print("data.csv was not found.")
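The difference is easy to demonstrate. In the sketch below, process_file is a hypothetical stand-in that contains a deliberate typo. The bare except reports the typo as a generic failure, while the named except lets it surface.

```python
def process_file(path):
    # Deliberate bug for illustration: 'pth' is a typo for 'path'.
    with open(pth) as f:
        return f.read()

# Bare except: the typo (a NameError) is swallowed as a generic failure.
try:
    process_file("data.csv")
except:
    bare_message = "Something failed."

# Named except: only a missing file is handled, so the typo escapes.
try:
    try:
        process_file("data.csv")
    except FileNotFoundError:
        named_message = "data.csv was not found."
except NameError as e:
    # This outer handler exists only so the demo can show what escaped.
    named_message = f"Bug surfaced: {e}"

print(bare_message)
print(named_message)
```

In a real script there would be no outer handler: the NameError would crash the program with a traceback pointing at the typo, which is the outcome you want.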
Try It 17.2 — Write a function safe_divide(a, b) that returns the result of a / b. Catch ZeroDivisionError and return None with a printed message. Catch TypeError and return None with a different message. Test it with three different inputs.
17.4 else and finally
else — When No Exception Occurs
The else block runs only if the try block completed without raising any exception. Use it to separate "the risky part" from "what happens when it works."
filename = "report.txt"
try:
    with open(filename, "r") as f:
        content = f.read()
except FileNotFoundError:
    print(f"File not found: {filename}")
else:
    print(f"File loaded successfully — {len(content)} characters.")
File loaded successfully — 312 characters.
If the file is missing, the else block never runs:
File not found: report.txt
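One practical reason to put follow-up work in else rather than inside try: an exception raised in the else block is not caught by the except clauses, so the handler covers only the risky line. A minimal sketch, using a hypothetical function parse_ratio:

```python
def parse_ratio(text):
    try:
        number = int(text)      # the only line the handler should cover
    except ValueError:
        print(f"'{text}' is not a number.")
        return None
    else:
        # Runs only after a successful conversion. An error raised here
        # is NOT caught by the except clause above.
        return 100 / number

print(parse_ratio("4"))
print(parse_ratio("abc"))

try:
    parse_ratio("0")
except ZeroDivisionError:
    print("ZeroDivisionError escaped the ValueError handler.")
```

Had the division lived inside the try block with a broader handler, a divide-by-zero could have been misreported as a conversion problem.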
finally — Always Runs
The finally block runs whether an exception occurred or not. Use it for cleanup — closing connections, releasing resources, writing to a log.
def load_config(path):
    print("Opening config...")
    try:
        with open(path, "r") as f:
            data = f.read()
    except FileNotFoundError:
        print("Config file missing — using defaults.")
        data = "{}"
    finally:
        print("Config load attempt complete.")
    return data
load_config("settings.json")
Opening config...
Config file missing — using defaults.
Config load attempt complete.
finally ran even though an exception occurred. This is the guarantee — finally always runs, which makes it the right place for anything that must happen regardless of outcome.
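The guarantee also holds when the try block returns early or when the exception is not caught at all. The sketch below records events in a list to show the order. The function and list names are illustrative.

```python
events = []

def read_value(data, key):
    try:
        return data[key]    # may raise KeyError, which is not caught here
    finally:
        # Runs on a normal return and on an uncaught exception alike.
        events.append(f"cleanup after {key}")

read_value({"a": 1}, "a")

try:
    read_value({"a": 1}, "b")
except KeyError:
    events.append("caller saw KeyError")

print(events)
```

The cleanup entry for "b" is recorded before the caller sees the KeyError, because finally runs before the exception leaves the function.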
The Full Flow
try:
│
├── success ──▶ else block ──▶ finally block ──▶ continues
│
└── exception ──▶ except block ──▶ finally block ──▶ continues
try:
    result = int("500")
except ValueError:
    print("Conversion failed.")
else:
    print(f"Converted successfully: {result}")
finally:
    print("Conversion attempt finished.")
Converted successfully: 500
Conversion attempt finished.
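Both branches of the flow diagram can be traced with one small function. This is a sketch with a hypothetical name, trace_conversion, that records which blocks ran.

```python
def trace_conversion(text):
    steps = ["try"]
    try:
        int(text)
    except ValueError:
        steps.append("except")
    else:
        steps.append("else")
    finally:
        steps.append("finally")
    steps.append("continues")
    return steps

print(trace_conversion("500"))            # success path
print(trace_conversion("five hundred"))   # exception path
```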
Try It 17.3 — Write a function that takes a filename and attempts to read it. Use try for the open, except FileNotFoundError to handle the missing file case, else to print the number of lines in the file, and finally to always print "read attempt done." Test it with both a real file and a missing one.
17.5 Raising Exceptions
Catching exceptions is one side of error handling. Raising them is the other. When your code receives input it cannot work with, raising an exception is the honest response — it tells the caller exactly what went wrong, immediately, rather than silently returning bad data.
raise
def set_batch_size(n):
    if n <= 0:
        raise ValueError(f"Batch size must be positive. Got: {n}")
    return n
set_batch_size(100)  # works fine
set_batch_size(-5)   # raises immediately
ValueError: Batch size must be positive. Got: -5
def load_record(data, key):
    if key not in data:
        raise KeyError(f"Required field '{key}' is missing from record.")
    return data[key]
record = {"name": "Alice", "status": "active"}
print(load_record(record, "name"))
print(load_record(record, "email"))
Alice
KeyError: "Required field 'email' is missing from record."
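Raising is only half of the exchange: the caller then decides what to do with the error. The sketch below repeats set_batch_size so that it runs on its own and adds a hypothetical caller, choose_batch_size, that falls back to a default and says so. The default of 100 is an assumption for illustration.

```python
def set_batch_size(n):
    if n <= 0:
        raise ValueError(f"Batch size must be positive. Got: {n}")
    return n

def choose_batch_size(requested, default=100):
    # Hypothetical caller: recovers with a default, but reports why.
    try:
        return set_batch_size(requested)
    except ValueError as e:
        print(f"{e} Falling back to {default}.")
        return default

print(choose_batch_size(250))
print(choose_batch_size(-5))
```

The function that detects the problem does not need to know how to recover from it. That decision stays with the code that has the context.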
When to Raise vs When to Handle
| Situation | Approach |
|---|---|
| You wrote the function, caller passed bad input | raise — make the problem the caller's to fix |
| You are calling external code (file, API, database) | except — handle what you cannot control |
| Bad data should stop the pipeline | raise — fail loudly |
| Bad data on one row should be skipped | except — log and continue |
The most dangerous pattern in data engineering is catching an exception and doing nothing — no log, no re-raise, no return value that signals failure. Data continues flowing through a broken pipeline. Errors should either be fixed, logged clearly, or re-raised.
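The last two rows of the table can live in the same function. This sketch, with a hypothetical parse_amounts function and an assumed logger name, logs and skips bad rows by default and fails loudly when strict=True.

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("pipeline")   # logger name is an assumption

def parse_amounts(rows, strict=False):
    """Convert strings to floats. strict=True stops on the first bad row."""
    amounts, skipped = [], 0
    for i, raw in enumerate(rows, start=1):
        try:
            amounts.append(float(raw))
        except ValueError:
            if strict:
                raise    # re-raise: bad data stops the pipeline
            skipped += 1
            logger.warning("Row %d skipped: %r is not a number", i, raw)
    return amounts, skipped

amounts, skipped = parse_amounts(["10.5", "N/A", "7"])
print(amounts, skipped)
```

Returning the skipped count alongside the data means the caller can always tell that something was dropped, even if nobody reads the log.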
17.6 Custom Exceptions
For larger codebases, you can define your own exception types by subclassing Exception. This makes error handling more descriptive and lets callers catch your specific errors without catching everything else.
class ValidationError(Exception):
    """Raised when a data record fails validation."""
    pass
class PipelineError(Exception):
    """Raised when a pipeline step fails unrecoverably."""
    pass
def validate_row(row):
    # Compare against None so that a legitimate id of 0 is not reported as missing.
    if row.get("id") is None:
        raise ValidationError(f"Record is missing required field 'id': {row}")
    if row["id"] < 0:
        raise ValidationError(f"Record has invalid id: {row['id']}")
    return True
try:
    validate_row({"name": "Alice", "status": "active"})
except ValidationError as e:
    print(f"Validation failed: {e}")
Validation failed: Record is missing required field 'id': {'name': 'Alice', 'status': 'active'}
Custom exceptions cost almost nothing to create and pay back significantly in code that is easier to debug and maintain.
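Custom exceptions can also be arranged in a hierarchy, so callers choose how precisely to catch. The sketch below is a variation on the classes above, not the same design: ValidationError is made a subclass of PipelineError, and SourceUnavailableError and run_step are hypothetical names added for illustration.

```python
class PipelineError(Exception):
    """Base class for every error this pipeline raises on purpose."""

class ValidationError(PipelineError):
    """A record failed validation."""

class SourceUnavailableError(PipelineError):
    """Hypothetical: an upstream source could not be reached."""

def run_step(kind):
    if kind == "bad_row":
        raise ValidationError("id is missing")
    if kind == "no_source":
        raise SourceUnavailableError("orders feed timed out")
    return "ok"

results = []
for kind in ("fine", "bad_row", "no_source"):
    try:
        results.append(run_step(kind))
    except ValidationError as e:
        results.append(f"skipped row: {e}")    # most specific handler first
    except PipelineError as e:
        results.append(f"step failed: {e}")    # any other pipeline error

print(results)
```

Order matters: Python uses the first matching except clause, so the subclass handler has to come before the base-class handler.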
⚠ Common Mistake — Catching Too Broadly Then Continuing
try:
    process_batch(records)
except Exception:
    pass  # ← worst pattern in data engineering
pass on a caught exception means your pipeline silently skips failures. Records are lost, counts are wrong, and there is no trace of what happened. At minimum, log the error. If the failure matters, re-raise it.
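A minimal sketch of the "log, then re-raise" alternative, using the standard logging module. Here process_batch is a hypothetical stand-in, and the logger name and record layout are assumptions.

```python
import logging

logger = logging.getLogger("pipeline")   # logger name is an assumption

def process_batch(records):
    # Hypothetical stand-in for real batch processing.
    return sum(r["value"] for r in records)

def run_batch(records):
    try:
        return process_batch(records)
    except Exception:
        # logger.exception records the message plus the full traceback.
        logger.exception("Batch of %d records failed", len(records))
        raise    # re-raise so the run is marked as failed

print(run_batch([{"value": 1}, {"value": 2}]))
```

Catching Exception broadly is acceptable here because nothing is hidden: the traceback goes to the log and the error still propagates to the caller.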
17.7 Putting It Together
This function reads a CSV of client records, validates each row, converts types, and collects errors separately from successes — a pattern used in real ingestion pipelines.
import csv
class ValidationError(Exception):
    pass
def validate_and_parse(row):
    """Parse and validate a single CSV row. Raises ValidationError on bad data."""
    required = ["id", "name", "contract_value", "status"]
    for field in required:
        # csv.DictReader fills short rows with None, so check the value, not just the key.
        if row.get(field) is None:
            raise ValidationError(f"Missing field: '{field}'")
    try:
        row["id"] = int(row["id"])
        row["contract_value"] = float(row["contract_value"])
    except ValueError as e:
        # "from e" keeps the original error attached to the new one.
        raise ValidationError(f"Type conversion failed: {e}") from e
    if row["status"] not in ("active", "inactive"):
        raise ValidationError(f"Invalid status: '{row['status']}'")
    return row
def load_clients(filepath):
    """Load and validate a CSV of client records."""
    valid = []
    invalid = []
    try:
        # newline="" is what the csv module documentation recommends for file objects.
        f = open(filepath, "r", newline="")
    except FileNotFoundError:
        print(f"Error: file not found — {filepath}")
        return [], []
    with f:
        reader = csv.DictReader(f)
        # Row numbers count data rows, starting at 1 after the header.
        for i, row in enumerate(reader, start=1):
            try:
                parsed = validate_and_parse(row)
                valid.append(parsed)
            except ValidationError as e:
                invalid.append({"row": i, "reason": str(e), "data": dict(row)})
    return valid, invalid
valid_records, errors = load_clients("clients.csv")
print(f"Loaded: {len(valid_records)} valid records")
print(f"Skipped: {len(errors)} invalid records")
if errors:
    print("\nValidation errors:")
    for err in errors:
        print(f"  Row {err['row']}: {err['reason']}")
Loaded: 4 valid records
Skipped: 1 invalid records
Validation errors:
Row 3: Type conversion failed: could not convert string to float: 'N/A'
One bad row does not crash the pipeline. It is caught, logged with the row number and reason, and the rest of the records continue loading. The calling code sees exactly which rows failed and why — Ayan can fix row 3 in the source file without re-running the entire batch.
Summary
- Python raises exceptions when it encounters something it cannot execute.
- Read tracebacks from the bottom up: the last line is always the error type and message.
- Use try and except to handle exceptions gracefully; always name the specific exception type, never use a bare except.
- The else block runs when no exception occurs; finally always runs, so use it for cleanup.
- Use raise to signal problems from your own code, with a message that explains exactly what went wrong.
- Custom exceptions give large codebases precise, searchable error types.
- The worst pattern is catching an exception and doing nothing: every caught error should be logged, handled, or re-raised.
Exercises
17.1 — Write a function parse_temperature(value) that converts a string to a float and raises a ValueError with a descriptive message if the conversion fails. Wrap a call to it in a try/except and print the result or the error cleanly.
17.2 — You have a dictionary of server configs. Write a function get_port(config, server_name) that returns the port for a given server. Raise a KeyError with a clear message if the server is not in the dictionary. Catch it at the call site and print a fallback message.
17.3 — The following code catches an exception and continues. Identify the problem with this approach and rewrite it with proper logging:
try:
    records = load_from_database(query)
except:
    records = []
17.4 — Write a function read_json_config(path) that opens and parses a JSON config file. Handle FileNotFoundError with a message saying the config is missing, json.JSONDecodeError with a message saying the file is not valid JSON, and return a default empty dictionary in both cases. Use else to print a success message.
17.5 — Create a custom exception BatchSizeError. Write a function create_batch(records, size) that raises BatchSizeError if size is less than 1 or greater than the number of records. Catch it at the call site and print the error message.
17.6 — Think About It: A data pipeline runs every night. It loads 10,000 records, and roughly 50 fail validation due to bad source data. The current code uses except: pass on each row, so the pipeline completes with a success status every night. The team discovers after six weeks that 2,100 records were silently dropped. What should the pipeline have done differently at each of these three points: when a row fails validation, at the end of each run, and when the failure rate crosses a threshold like 1%?