Python for Operations
Python has become the lingua franca for IT automation. You don’t need to be a software engineer — you need to read, write, and debug small scripts that do one thing well. Network devices, cloud APIs, log parsing, Ansible modules — all of it is Python-adjacent.
Why Python for ops
- Readable — looks a lot like pseudocode; low cognitive load for someone who doesn’t code daily
- Batteries included — standard library covers HTTP, JSON, subprocess, files, regex, sockets
- Huge ecosystem — Netmiko, NAPALM, boto3 (AWS), Azure SDK, requests, paramiko — someone has already written the library for your task
- Cross-platform — runs on Linux, Windows, macOS
- Ansible, SaltStack, Netmiko, NAPALM are written in Python — network automation is Python-shaped
The minimum you need to read any Python script
Variables and types

    name = "router-01"                        # string
    port = 22                                 # int
    online = True                             # bool
    interfaces = ["eth0", "eth1"]             # list
    config = {"hostname": "r1", "port": 22}   # dict (key-value map)

Types are inferred from the assigned value; you don’t declare them. The `: type` annotations that follow variable names in some code are optional type hints. Python does not enforce them at runtime; they exist for readers and for tools such as linters.
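
To make the inference concrete, here is a minimal sketch that reuses the sample values above. The `vlan` key is a made-up example of a key that is deliberately missing; `type()` and `dict.get()` are built-ins.

```python
# Same sample values as above; Python works out each type from the value.
name = "router-01"
port = 22
interfaces = ["eth0", "eth1"]
config = {"hostname": "r1", "port": 22}

print(type(name).__name__)   # str
print(type(port).__name__)   # int

# Lists are indexed by position, dicts by key.
first_iface = interfaces[0]
hostname = config["hostname"]

# .get() returns a fallback instead of raising KeyError for a missing key.
# "vlan" is a hypothetical key that is not in the dict.
vlan = config.get("vlan", 1)

# A type hint documents intent; Python does not check it at runtime.
timeout: int = 5
```

Reaching for `.get()` with a default is a common way to read optional settings without wrapping every lookup in error handling.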
Control flow
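
    if online and port == 22:
        print("SSH reachable")
    elif not online:
        print("Down")
    else:
        print("Wrong port")

    for iface in interfaces:
        print(iface)

    retries = 3           # example starting value
    while retries > 0:
        attempt()         # placeholder for whatever you are retrying
        retries -= 1

Indentation is syntactic: blocks are delimited by indentation rather than braces. It must be consistent within a block (4 spaces is the convention).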
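
The `while` loop above retries a fixed number of times whether or not an attempt succeeds. A common variation stops at the first success. The sketch below assumes the action returns True on success; `run_with_retries` and the `attempt` stand-in are hypothetical names, not part of any library.

```python
import random


def attempt():
    """Hypothetical flaky operation: succeeds roughly half the time."""
    return random.random() < 0.5


def run_with_retries(action, retries=3):
    """Call action() until it returns True or the retries run out."""
    while retries > 0:
        if action():
            return True      # stop at the first success
        retries -= 1
    return False             # every attempt failed


if run_with_retries(attempt):
    print("Succeeded")
else:
    print("Gave up after 3 attempts")
```

Passing the action in as an argument keeps the retry logic separate from whatever is being retried, so the same helper can wrap a ping, an API call, or an SSH login.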
Functions
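
    import subprocess

    def ping_host(host, timeout=5):
        """Ping a host; return True if alive."""
        # Flags as understood by Linux ping: -c 1 sends one packet,
        # -W is the reply timeout in seconds.
        result = subprocess.run(["ping", "-c", "1", "-W", str(timeout), host])
        return result.returncode == 0

    if ping_host("8.8.8.8"):
        print("Internet is up")

`def` introduces a function. Default argument values (`timeout=5`) let callers omit arguments they don’t care about, which is why they are so common. The docstring, a triple-quoted string on the first line of the body, is the conventional place to say what the function does.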
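
Defaults pair naturally with keyword arguments at the call site. The following sketch uses a hypothetical helper, `build_ssh_command`, that only builds an argument list and executes nothing; the `admin` user and port values are illustrative.

```python
def build_ssh_command(host, user="admin", port=22):
    """Return the argument list for an ssh call (nothing is executed)."""
    return ["ssh", "-p", str(port), f"{user}@{host}"]


# Rely on both defaults.
default_cmd = build_ssh_command("r1")

# Override one default by name and leave the other alone.
custom_cmd = build_ssh_command("r1", port=2222)

print(default_cmd)
print(custom_cmd)
```

Naming the argument (`port=2222`) means the call keeps working even if more optional parameters are added to the function later.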
Error handling

    import requests

    url = "https://example.com/api/status"  # placeholder URL

    try:
        response = requests.get(url, timeout=5)
        response.raise_for_status()   # raises HTTPError on 4xx/5xx responses
        data = response.json()
    except requests.Timeout:
        print("Timed out")
    except requests.HTTPError as e:
        print(f"HTTP error: {e}")

try/except is Python’s if-the-error-happens mechanism. The pattern is: attempt the risky thing, handle the specific failures you expect, and let unexpected ones propagate (or be caught by a broader except higher up).
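
The same pattern can be tried without a network connection. In this sketch, `parse_port` is a hypothetical helper: it handles the one failure it expects (`ValueError` from non-numeric text) and lets anything else, such as the `TypeError` raised for `None`, propagate to the caller.

```python
def parse_port(text):
    """Convert text to a TCP port number, or return None if it is not usable."""
    try:
        port = int(text)
    except ValueError:
        # Expected failure: the text was not a number.
        return None
    if not 1 <= port <= 65535:
        # A number, but outside the valid port range.
        return None
    return port


print(parse_port("22"))      # valid port
print(parse_port("ssh"))     # not a number, handled
print(parse_port("70000"))   # out of range, handled
# parse_port(None) would raise TypeError: that is a programming error,
# so it is deliberately not caught here.
```

Catching only `ValueError` keeps genuine bugs visible instead of silently turning them into "no port".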
f-strings (string formatting)

    host = "r1"
    port = 22
    print(f"Connecting to {host}:{port}")  # "Connecting to r1:22"

The `f` before the opening quote means “evaluate whatever is inside {} and insert the result.” f-strings (Python 3.6+) are the preferred replacement for the older %-formatting and .format().
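
f-strings also accept a format specification after a colon, which helps when lining up report columns. A small sketch with made-up measurements (`latency_ms` and `loss` are illustrative values):

```python
host = "r1"
latency_ms = 12.3456
loss = 0.025

# <10 left-aligns in 10 columns, >8.1f right-aligns with one decimal,
# .1% multiplies by 100 and appends a percent sign.
line = f"{host:<10}{latency_ms:>8.1f} ms  loss {loss:.1%}"
print(line)

# The = specifier (Python 3.8+) prints the expression and its repr,
# which is handy for quick debugging output.
debug = f"{host=}"
print(debug)
```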
Standard library modules you’ll use constantly
| Module | For |
|---|---|
| subprocess | Run shell commands, capture output |
| os / pathlib | Files, directories, env vars |
| json | Parse / emit JSON |
| re | Regular expressions |
| requests (3rd party, but universal) | HTTP APIs |
| argparse | Command-line arguments |
| logging | Proper logs instead of print() |
| datetime | Dates, times, durations |
| csv | CSV parsing |
| itertools / collections | Data-crunching helpers |
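
As an illustration of how several of these modules combine, the sketch below counts ERROR lines per host using `re`, `collections`, and `json`. The log lines and their layout (timestamp, host, level, message) are invented for the example; real logs will need a different pattern.

```python
import json
import re
from collections import Counter

# Made-up syslog-style lines for illustration.
LOG_LINES = [
    "2024-05-01T10:00:00 r1 ERROR link down on eth0",
    "2024-05-01T10:00:05 r2 INFO config saved",
    "2024-05-01T10:00:09 r1 ERROR link down on eth1",
]

# Named groups make the match readable: timestamp, host, level, message.
LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<host>\S+) (?P<level>[A-Z]+) (?P<msg>.*)$")


def count_errors(lines):
    """Return a Counter mapping host name to its number of ERROR lines."""
    errors = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match["level"] == "ERROR":
            errors[match["host"]] += 1
    return errors


# Counter is a dict subclass, so json.dumps can serialize it directly.
summary = json.dumps(count_errors(LOG_LINES), sort_keys=True)
print(summary)
```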
Virtual environments — the one thing everyone skips and regrets
Python projects should isolate their dependencies. Otherwise one script’s library version breaks another.

    python3 -m venv .venv          # create a virtual environment
    source .venv/bin/activate      # activate it (Linux/macOS)
    pip install requests netmiko   # install packages inside it

Without a venv, every pip install lands in a shared Python installation, and version conflicts accumulate across scripts. Always use a venv. Modern tools that manage environments for you: poetry, uv, pipenv.
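
A script can check for itself whether it is running inside a venv. This sketch relies on the documented behavior that `sys.prefix` differs from `sys.base_prefix` inside an environment created with `python3 -m venv`; the function name `in_virtualenv` is an assumption for illustration.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if in_virtualenv():
    print(f"Using venv at {sys.prefix}")
else:
    print("Warning: not in a virtual environment")
```

Printing a warning like this at the top of a setup script is a cheap guard against installing packages into the wrong Python.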
A realistic ops script

    #!/usr/bin/env python3
    """Check a list of hosts and report which are unreachable."""
    import argparse
    import subprocess
    from concurrent.futures import ThreadPoolExecutor


    def ping(host):
        # One packet, 2-second reply timeout (Linux ping flags); output discarded.
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "2", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return host, result.returncode == 0


    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("hosts_file")
        args = parser.parse_args()

        # One host per line; blank lines are skipped.
        with open(args.hosts_file) as f:
            hosts = [line.strip() for line in f if line.strip()]

        # Ping up to 20 hosts at a time.
        with ThreadPoolExecutor(max_workers=20) as pool:
            results = list(pool.map(ping, hosts))

        for host, alive in results:
            if not alive:
                print(f"DOWN: {host}")


    if __name__ == "__main__":
        main()

This is idiomatic ops Python: argparse for the CLI, parallel pinging with ThreadPoolExecutor, a main() function, and an `if __name__ == "__main__":` guard so the file can also be imported without running.
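
The parallel part of the script can be exercised without touching the network by swapping the check function. In this sketch, `fake_ping` and `find_down` are hypothetical names; `fake_ping` simply treats any host whose name starts with `down-` as unreachable.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_ping(host):
    """Hypothetical stand-in for ping(): 'down-' hosts count as dead."""
    return host, not host.startswith("down-")


def find_down(hosts, check=fake_ping, max_workers=20):
    """Run check() on every host in parallel and return the unreachable ones."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input list,
        # even though the calls themselves may finish out of order.
        results = list(pool.map(check, hosts))
    return [host for host, alive in results if not alive]


print(find_down(["r1", "down-r2", "r3", "down-r4"]))
```

Taking the check function as a parameter is one possible design; it lets the same code run against real pings in production and against a stub in tests.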
Libraries for network & cloud ops
- Netmiko — SSH to network devices, send commands, parse output. The default for multi-vendor CLI automation.
- NAPALM: vendor-neutral abstraction layer with per-platform drivers, some of which use Netmiko underneath. Retrieves state and pushes config through one consistent API across Cisco/Juniper/Arista.
- Nornir — Python automation framework for network inventory + parallel execution.
- paramiko — low-level SSH library; Netmiko sits on top.
- boto3 — AWS SDK.
- azure-sdk-for-python — Azure SDK.
- requests — HTTP; use it for REST APIs (e.g., calling RESTCONF on a device).
- PyYAML — parse/emit YAML.
- Jinja2 — template engine (also used by Ansible for config templates).
Debugging
Three levels:
1. `print()` everywhere. Fine for small scripts.
2. `logging` module. Levels (DEBUG, INFO, WARNING, ERROR). Turn DEBUG on when troubleshooting; off in prod.
3. `pdb` (debugger). Put `breakpoint()` in your code, run the script, drop into an interactive prompt.
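
A minimal sketch of the second level, switching between INFO and DEBUG with a flag. The logger name `hostcheck` and the helper names are assumptions for illustration; `force=True` requires Python 3.8 or later.

```python
import logging

logger = logging.getLogger("hostcheck")  # hypothetical logger name


def configure_logging(debug=False):
    """Send log records to stderr at DEBUG or INFO level."""
    logging.basicConfig(
        level=logging.DEBUG if debug else logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # Python 3.8+: replace handlers configured earlier
    )


def check(host, alive):
    """Log the outcome of a reachability check and return it unchanged."""
    logger.debug("checking %s", host)          # shown only when debug=True
    if not alive:
        logger.warning("%s is unreachable", host)
    return alive


configure_logging(debug=True)
check("r1", alive=True)
check("r2", alive=False)
```

In a real script the `debug` flag would typically come from a command-line option, so verbose output can be turned on without editing the code.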
Pythonic idioms worth knowing
- List comprehension: `[x * 2 for x in numbers if x > 0]` replaces a loop
- Dict comprehension: `{k: v.upper() for k, v in data.items()}`
- Context managers: `with open(...) as f:` automatically closes the file
- Walrus operator: `if (n := len(data)) > 100:` assigns and tests in one step (3.8+)
- Type hints: `def ping(host: str, timeout: int = 5) -> bool:` for readers and linters
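
The idioms above can be seen together in one short, runnable sketch. All sample values, the `hosts.txt` file name, and the `describe` function are made up for illustration; the file is written to a temporary directory so nothing is left behind.

```python
import os
import tempfile

numbers = [3, -1, 4, 0, 5]
data = {"hostname": "r1", "site": "lab"}

# List comprehension: keep the positive numbers and double them.
doubled = [x * 2 for x in numbers if x > 0]

# Dict comprehension: same keys, upper-cased values.
upper = {k: v.upper() for k, v in data.items()}

# Context managers: the directory is removed and each file is closed
# automatically when its with-block ends.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hosts.txt")
    with open(path, "w") as f:
        f.write("r1\nr2\n")
    with open(path) as f:
        hosts = [line.strip() for line in f]


# Type hints plus the walrus operator (Python 3.8+).
def describe(items: list, limit: int = 3) -> str:
    if (n := len(items)) > limit:
        return f"large ({n} items)"
    return f"small ({n} items)"


print(doubled, upper, hosts, describe(numbers))
```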