ESP32 Security Camera System


Transform the ESP32-CAM into a fully featured security camera that detects motion using a PIR sensor, records JPEG snapshots to an SD card, streams live MJPEG video to a browser, and sends instant Telegram photo alerts when intrusion is detected.

Overview

In this beginner project you will connect a PIR motion sensor to the ESP32-CAM and take a JPEG snapshot when motion is detected. The snapshot is saved to the microSD card in the onboard slot with a sequential filename. The onboard LED flashes when a photo is captured, and the Serial Monitor reports each capture with its filename and size. This teaches ESP32-CAM initialisation, PIR input polling, and writing JPEG files to the SD card.

Components
  • 1× ESP32-CAM module (AI Thinker)
  • 1× HC-SR501 PIR motion sensor (5 V supply, 3.3 V logic output; adjustable sensitivity and delay)
  • 1× MicroSD card (FAT32, under 32 GB) — Onboard SD slot on AI Thinker module
  • 1× FTDI USB-to-Serial programmer — For upload and Serial Monitor
  • 1× Jumper wires
Wiring
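Component Pin | ESP32 Pin | Notes
PIR OUT | GPIO 13 | Goes HIGH on motion; 3.3 V logic
PIR VCC | 5 V | HC-SR501 is specified for a supply of about 4.5-20 V; its output stays at 3.3 V logic
PIR GND | GND |
SD card | Onboard SD_MMC slot (GPIO 2/14/15 in 1-bit mode) | Built into AI Thinker module; 4-bit mode also uses GPIO 4/12/13
Onboard LED | GPIO 33 (inverted) | LOW = LED on for AI Thinker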
Arduino Code
esp32-security-camera-system_beginner.ino
// ESP32-CAM Security Camera - Beginner
// PIR motion trigger -> JPEG snapshot to SD card

#include "esp_camera.h"
#include "FS.h"
#include "SD_MMC.h"

const int PIR_PIN = 13;
const int LED_PIN = 33; // Active LOW on AI Thinker
int photoCount = 0;

void initCamera(){
  camera_config_t c={};
  c.ledc_channel=LEDC_CHANNEL_0; c.ledc_timer=LEDC_TIMER_0;
  c.pin_d0=5;c.pin_d1=18;c.pin_d2=19;c.pin_d3=21;
  c.pin_d4=36;c.pin_d5=39;c.pin_d6=34;c.pin_d7=35;
  c.pin_xclk=0;c.pin_pclk=22;c.pin_vsync=25;c.pin_href=23;
  c.pin_sscb_sda=26;c.pin_sscb_scl=27;c.pin_reset=-1;c.pin_pwdn=32;
  c.xclk_freq_hz=20000000;
  c.pixel_format=PIXFORMAT_JPEG;
  c.frame_size=FRAMESIZE_VGA;
  c.jpeg_quality=10;
  c.fb_count=1;
  if(esp_camera_init(&c)!=ESP_OK)
    Serial.println("Camera init failed");
}

void capturePhoto(){
  camera_fb_t *fb = esp_camera_fb_get();
  if(!fb){ Serial.println("Frame capture failed"); return; }

  String path = "/photo" + String(photoCount++) + ".jpg";
  File f = SD_MMC.open(path, FILE_WRITE);
  if(f){
    f.write(fb->buf, fb->len);
    f.close();
    Serial.printf("Saved: %s (%u bytes)\n", path.c_str(), fb->len);
  }
  esp_camera_fb_return(fb);

  // Flash LED briefly
  digitalWrite(LED_PIN, LOW); delay(200); digitalWrite(LED_PIN, HIGH);
}

void setup(){
  Serial.begin(115200);
  pinMode(PIR_PIN, INPUT);
  pinMode(LED_PIN, OUTPUT); digitalWrite(LED_PIN, HIGH);
  // 1-bit mode uses only GPIO 2/14/15 and leaves GPIO 13 free for the PIR
  if(!SD_MMC.begin("/sdcard", true)) Serial.println("SD card mount failed");
  initCamera();
  Serial.println("Security camera armed. Waiting for motion...");
}

void loop(){
  if(digitalRead(PIR_PIN) == HIGH){
    Serial.println("Motion detected!");
    capturePhoto();
    delay(5000); // cooldown between captures
  }
  delay(100);
}
How It Works
01

PIR Motion Detection: The HC-SR501 PIR sensor detects infrared radiation changes caused by a warm moving body. Its output goes HIGH for an adjustable duration (0.5-200 seconds via onboard potentiometer) when motion is detected. The ESP32-CAM polls this pin every 100 ms and takes a snapshot whenever it reads HIGH, so a long HIGH pulse can produce several captures, one per cooldown period.
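The loop above reacts to the pin level rather than to a transition. If one capture per motion event is preferred, a common approach is to remember the previous sample and act only on a LOW to HIGH change. The following is a minimal sketch in plain C++ so it can be checked on a desktop compiler; `RisingEdge` and `countMotionEvents` are hypothetical names, and the `bool` passed to `update()` stands in for `digitalRead(PIR_PIN) == HIGH`.

```cpp
// Hypothetical helper: reports true only on a LOW -> HIGH transition.
struct RisingEdge {
    bool previous = false;

    bool update(bool level) {
        bool rose = level && !previous;
        previous = level;
        return rose;
    }
};

// Counts how many separate motion events a series of polled samples contains.
int countMotionEvents(const bool* samples, int count) {
    RisingEdge edge;
    int events = 0;
    for (int i = 0; i < count; ++i) {
        if (edge.update(samples[i])) ++events;
    }
    return events;
}
```

In the sketch, `update()` would be called once per pass through `loop()`, with `capturePhoto()` running only when it returns true.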

02

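SD_MMC vs SD Library: The AI Thinker ESP32-CAM connects its SD card via the SD/MMC host interface (not SPI). In 1-bit mode this uses GPIO 2, 14, and 15; the default 4-bit mode also uses GPIO 4, 12, and 13. Use the SD_MMC library rather than the SPI-based SD library. SD_MMC.begin() without arguments selects 4-bit mode, which claims GPIO 13 and therefore conflicts with the PIR input; mounting with SD_MMC.begin("/sdcard", true) selects 1-bit mode and leaves GPIO 13 free.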

03

JPEG Frame Capture: esp_camera_fb_get() returns a pointer to a frame buffer containing a complete JPEG-encoded image. The JPEG data (fb->buf, fb->len bytes) is written directly to the SD card file. esp_camera_fb_return() releases the buffer back to the camera DMA pool.
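Before writing a frame to the card it can be useful to sanity-check that the buffer really holds a JPEG. A JPEG stream starts with the marker bytes FF D8 and ends with FF D9, so a cheap check looks like the sketch below. `looksLikeJpeg` is a hypothetical helper written in plain C++; it assumes the driver trims the buffer to the end-of-image marker, so a failed end check may only indicate trailing padding rather than a corrupt frame.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true when the buffer starts with the JPEG start-of-image marker
// (FF D8) and ends with the end-of-image marker (FF D9).
bool looksLikeJpeg(const uint8_t* buf, size_t len) {
    if (buf == nullptr || len < 4) return false;
    bool hasStart = buf[0] == 0xFF && buf[1] == 0xD8;
    bool hasEnd = buf[len - 2] == 0xFF && buf[len - 1] == 0xD9;
    return hasStart && hasEnd;
}
```

In the sketch this would be called as `looksLikeJpeg(fb->buf, fb->len)` before opening the file.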

04

5-Second Cooldown: After each capture a 5-second delay prevents filling the SD card with hundreds of nearly identical frames during a single motion event. The PIR sensor also has its own retrigger timer; set the potentiometer to minimum delay for fastest response.
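The delay(5000) call blocks the whole sketch for the cooldown period. A non-blocking alternative compares timestamps instead. The sketch below is plain C++ under the assumption that the caller passes in the value of millis(); `Cooldown` is a hypothetical helper. Unsigned subtraction keeps the comparison correct when the 32-bit millisecond counter rolls over.

```cpp
#include <cstdint>

// Hypothetical helper: allows one trigger per interval without blocking.
struct Cooldown {
    uint32_t intervalMs;
    uint32_t lastTrigger = 0;
    bool hasTriggered = false;

    explicit Cooldown(uint32_t interval) : intervalMs(interval) {}

    // Returns true (and restarts the interval) when a new trigger is allowed.
    // Unsigned subtraction keeps the comparison valid across counter rollover.
    bool tryTrigger(uint32_t nowMs) {
        if (hasTriggered && (nowMs - lastTrigger) < intervalMs) return false;
        lastTrigger = nowMs;
        hasTriggered = true;
        return true;
    }
};
```

With a global `Cooldown cooldown(5000);`, the loop body would capture only when the PIR reads HIGH and `cooldown.tryTrigger(millis())` returns true.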

Applications
  • Driveway intruder detection with timestamped photo evidence
  • Wildlife camera trap for animal activity logging
  • Shop entrance customer counter with photo log
  • Package delivery porch camera with motion-triggered capture
Troubleshooting

SD_MMC.begin() returns false

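Format the SD card as FAT32 on a computer. Cards over 32 GB formatted as exFAT are not supported. Remove and reinsert the card. Also check the data pins: in the default 4-bit mode GPIO 4 (data 1), GPIO 12 (data 2), and GPIO 13 (data 3) belong to the SD_MMC interface on AI Thinker, so nothing else may use them. With the PIR on GPIO 13, mount the card in 1-bit mode with SD_MMC.begin("/sdcard", true).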

PIR triggers immediately on power-up

The HC-SR501 has a 60-second initialisation period after power-on during which it may produce false triggers. Add a 60-second startup delay before arming the detection logic.

Photos are very dark

The OV2640 automatic exposure needs a few frames to adjust. Call esp_camera_fb_get() twice and discard the first frame before saving the second. Also add a delay of 300 ms after camera init to allow auto-exposure to settle.

Upgrades
  • Add Wi-Fi and send the captured photo via Telegram immediately after saving to SD
  • Add NTP timestamps to filenames for chronological organisation
  • Add a second PIR sensor for wider coverage area
  • Add a live MJPEG web stream for remote viewing between motion events
FAQ

You need an ESP32-CAM (AI Thinker), an HC-SR501 PIR sensor, a microSD card, jumper wires, and an FTDI USB-to-Serial programmer for power and programming.

The Intermediate and Advanced stages use Wi-Fi. The Beginner build runs offline and only needs power.

Start with Beginner if you are new to security projects. Use Intermediate for live streaming and Telegram alerts, and Advanced for face detection, MQTT events, and a Node-RED dashboard.

Overview

The intermediate build adds a live MJPEG video stream accessible from any browser on the local network. When the PIR detects motion, the ESP32-CAM captures a JPEG frame, saves it to the SD card under a filename that contains the Unix timestamp, and sends it as a Telegram photo alert. A simple web interface at the ESP32 IP address shows the live stream and a list of the last five captured filenames.

Components
  • 1× ESP32-CAM module
  • 1× HC-SR501 PIR sensor
  • 1× MicroSD card (FAT32)
  • 1× Wi-Fi router — Live stream and Telegram alerts
  • 1× Telegram bot token and chat ID
Wiring
Component Pin | ESP32 Pin | Notes
PIR OUT | GPIO 13 |
SD card | Onboard MMC interface |
Arduino Code
esp32-security-camera-system_intermediate.ino
// ESP32-CAM Security Camera - Intermediate (MJPEG stream + PIR + Telegram alert)
#include "esp_camera.h"
#include "FS.h"
#include "SD_MMC.h"
#include <WiFi.h>
#include <WebServer.h>
#include <WiFiClientSecure.h>
#include <time.h>

WebServer server(80);
const char* SSID="YourSSID", *PASS="YourPass";
const char* BOT_TOKEN="YOUR_BOT_TOKEN";
const char* CHAT_ID="YOUR_CHAT_ID";
const int PIR=13;
int photoCount=0;
String lastFiles[5];

void initCamera(){
  camera_config_t c={};
  c.ledc_channel=LEDC_CHANNEL_0; c.ledc_timer=LEDC_TIMER_0;
  c.pin_d0=5;c.pin_d1=18;c.pin_d2=19;c.pin_d3=21;
  c.pin_d4=36;c.pin_d5=39;c.pin_d6=34;c.pin_d7=35;
  c.pin_xclk=0;c.pin_pclk=22;c.pin_vsync=25;c.pin_href=23;
  c.pin_sscb_sda=26;c.pin_sscb_scl=27;c.pin_reset=-1;c.pin_pwdn=32;
  c.xclk_freq_hz=20000000;
  c.pixel_format=PIXFORMAT_JPEG;
  c.frame_size=FRAMESIZE_VGA;
  c.jpeg_quality=10; c.fb_count=2;
  esp_camera_init(&c);
}

void sendTelegramPhoto(camera_fb_t *fb){
  WiFiClientSecure cl; cl.setInsecure();
  if(!cl.connect("api.telegram.org",443)) return;
  String boundary="ESP32Boundary";
  String head="--"+boundary+"\r\nContent-Disposition: form-data; name=\"chat_id\"\r\n\r\n"
    +String(CHAT_ID)+"\r\n--"+boundary+"\r\nContent-Disposition: form-data; "
    "name=\"caption\"\r\n\r\nMotion Detected!\r\n--"+boundary+"\r\n"
    "Content-Disposition: form-data; name=\"photo\"; filename=\"cam.jpg\"\r\n"
    "Content-Type: image/jpeg\r\n\r\n";
  String tail="\r\n--"+boundary+"--\r\n";
  int len=head.length()+fb->len+tail.length();
  cl.printf("POST /bot%s/sendPhoto HTTP/1.1\r\nHost: api.telegram.org\r\n"
    "Content-Type: multipart/form-data; boundary=%s\r\nContent-Length: %d\r\n\r\n",
    BOT_TOKEN,boundary.c_str(),len);
  cl.print(head); cl.write(fb->buf,fb->len); cl.print(tail);
  delay(2000); cl.stop();
}

void handleStream(){
  WiFiClient client=server.client();
  client.println("HTTP/1.1 200 OK");
  client.println("Content-Type: multipart/x-mixed-replace; boundary=frame");
  client.println();
  while(client.connected()){
    camera_fb_t *fb=esp_camera_fb_get();
    if(!fb) break;
    client.printf("--frame\r\nContent-Type: image/jpeg\r\nContent-Length: %u\r\n\r\n",fb->len);
    client.write(fb->buf,fb->len);
    client.println();
    esp_camera_fb_return(fb);
    delay(50);
  }
}

void handleRoot(){
  String html="<html><body><h2>ESP32 Security Camera</h2>"
    "<img src=\"/stream\" width=\"640\"><br><h3>Recent captures:</h3><ul>";
  for(int i=0;i<5;i++) if(lastFiles[i].length()) html+="<li>"+lastFiles[i]+"</li>";
  html+="</ul></body></html>";
  server.send(200,"text/html",html);
}

void captureAndAlert(){
  camera_fb_t *fb=esp_camera_fb_get();
  if(!fb) return;
  time_t now=time(nullptr);
  String path="/cam_"+String(now)+".jpg";
  File f=SD_MMC.open(path,FILE_WRITE);
  if(f){ f.write(fb->buf,fb->len); f.close(); }
  for(int i=4;i>0;i--) lastFiles[i]=lastFiles[i-1];
  lastFiles[0]=path;
  sendTelegramPhoto(fb);
  esp_camera_fb_return(fb);
}

void setup(){
  Serial.begin(115200);
  pinMode(PIR,INPUT);
  SD_MMC.begin("/sdcard",true); initCamera(); // 1-bit SD mode leaves GPIO 13 free for the PIR
  WiFi.begin(SSID,PASS);
  while(WiFi.status()!=WL_CONNECTED) delay(500);
  configTime(0,0,"pool.ntp.org");
  server.on("/",handleRoot);
  server.on("/stream",handleStream);
  server.begin();
  Serial.printf("Camera: http://%s/\n",WiFi.localIP().toString().c_str());
}

void loop(){
  server.handleClient();
  static unsigned long lastMotion=0;
  if(digitalRead(PIR)==HIGH&&millis()-lastMotion>10000){
    lastMotion=millis();
    captureAndAlert();
  }
}
How It Works
01

MJPEG Stream Handler: The /stream endpoint serves a multipart/x-mixed-replace HTTP response. Each part is a complete JPEG image followed by a boundary delimiter. The browser receives an endless stream of JPEG frames and renders them as a continuous video at approximately 5-15 FPS depending on network speed and frame size.
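The exact bytes of each part header matter, because the browser relies on the CRLF line endings and the blank line to find the start of the image. As an illustration, the header can be built as follows. This is a plain C++ sketch; `mjpegPartHeader` and `kBoundary` are hypothetical names, and the boundary value is taken from the Content-Type line used by the stream handler.

```cpp
#include <cstddef>
#include <string>

// Boundary token; it must match the one announced in the response's
// Content-Type header (boundary=frame in the sketch above).
const char* const kBoundary = "frame";

// Builds the header that precedes each JPEG in the multipart stream.
std::string mjpegPartHeader(size_t jpegLength) {
    std::string header = "--";
    header += kBoundary;
    header += "\r\nContent-Type: image/jpeg\r\nContent-Length: ";
    header += std::to_string(jpegLength);
    header += "\r\n\r\n";  // blank line separates the part header from the image bytes
    return header;
}
```

The JPEG bytes follow immediately after this header, and a trailing CRLF closes the part before the next boundary line.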

02

Dual Frame Buffer: Setting fb_count=2 allocates two frame buffers, so the camera driver can fill one while the firmware sends the other. This raises the achievable stream frame rate compared with single-buffer operation. The extra buffer is expected to live in the PSRAM fitted to the AI Thinker module; on a board without PSRAM, keep fb_count at 1.

03

NTP Timestamp in Filename: configTime() syncs the ESP32 clock. Each capture file is named with the Unix timestamp (e.g. /cam_1750000000.jpg). This provides chronological ordering of captures on the SD card and makes it easy to correlate captures with known events.
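Until the first NTP response arrives, the ESP32 clock starts counting from the Unix epoch, so early captures would get names such as /cam_12.jpg. One possible guard is to treat implausibly small times as "not synchronised" and fall back to a counter. The following plain C++ sketch illustrates the idea; `captureName`, the /boot_ prefix, and the cut-off constant are assumptions made for this example, not part of the original sketch.

```cpp
#include <ctime>
#include <string>

// Any time before this is treated as "clock not yet synchronised".
// 1600000000 (September 2020) is an arbitrary cut-off chosen for this sketch.
const time_t kEarliestPlausibleTime = 1600000000;

// Returns /cam_<unix time>.jpg when the clock looks valid,
// otherwise /boot_<counter>.jpg so unsynchronised captures stay distinguishable.
std::string captureName(time_t now, unsigned counter) {
    if (now >= kEarliestPlausibleTime) {
        return "/cam_" + std::to_string(static_cast<long long>(now)) + ".jpg";
    }
    return "/boot_" + std::to_string(counter) + ".jpg";
}
```

In the sketch the result would be converted with `String(captureName(time(nullptr), photoCount++).c_str())` before being passed to SD_MMC.open().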

04

Recent Captures List: A five-element string array tracks the last five capture filenames; each new capture shifts the existing entries down one slot and is stored at index 0. The root web page renders these as an HTML list so the user can see the most recent activity at a glance without accessing the SD card directly.
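Shifting five strings on every capture is cheap, but a true ring buffer avoids the copies and scales to longer histories. The following is a minimal plain C++ sketch of that alternative; `RecentCaptures` is a hypothetical class and uses std::string where the Arduino sketch uses String.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical fixed-size history that overwrites the oldest entry.
class RecentCaptures {
public:
    static const size_t kCapacity = 5;

    void push(const std::string& name) {
        names_[next_] = name;
        next_ = (next_ + 1) % kCapacity;
        if (count_ < kCapacity) ++count_;
    }

    size_t size() const { return count_; }

    // index 0 is the newest entry, size() - 1 the oldest.
    const std::string& newest(size_t index) const {
        size_t slot = (next_ + kCapacity - 1 - index) % kCapacity;
        return names_[slot];
    }

private:
    std::array<std::string, kCapacity> names_;
    size_t next_ = 0;
    size_t count_ = 0;
};
```

The page handler would then loop from 0 to size() - 1 and emit newest(i) as list items.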

Applications
  • Home security camera with live monitoring and motion alerts
  • Remote property monitoring with instant notification
  • Office or shop CCTV with SD card evidence storage
  • Baby monitor with motion-triggered photo alert to parents
Troubleshooting

MJPEG stream freezes after a few seconds

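The stream and the PIR capture share the camera framebuffers. In this single-threaded sketch, handleStream() runs inside server.handleClient() and does not return while a browser stays connected, so loop() cannot reach the PIR check during streaming. If the stream is moved to its own task, captureAndAlert() may call esp_camera_fb_get() while the stream is also waiting for a frame, and a deadlock can occur. Add a mutex or use a flag to pause the stream during capture.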
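One way to implement such a flag is an atomic busy marker that each camera user must claim before requesting a frame. The sketch below is plain C++ using std::atomic, which is assumed to be available in the ESP32 toolchain; `CameraGate` is a hypothetical name.

```cpp
#include <atomic>

// Hypothetical guard: only one user of the camera at a time.
class CameraGate {
public:
    // Returns true if the caller now owns the camera.
    bool tryAcquire() {
        bool expected = false;
        return busy_.compare_exchange_strong(expected, true);
    }

    void release() { busy_.store(false); }

private:
    std::atomic<bool> busy_{false};
};
```

The stream loop would call tryAcquire() before esp_camera_fb_get() and simply skip that frame when it returns false, while the capture path would retry until it succeeds. Both sides call release() after esp_camera_fb_return().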

Telegram photo is always blurry or overexposed

Add a 300 ms warm-up delay between camera init and the first capture. Also skip the first frame from esp_camera_fb_get() (call once and return, then capture the second frame) to allow auto-exposure to stabilise.

Web stream works in Chrome but not Safari

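Safari requires the Content-Length header in each MJPEG part. The stream handler above already sends it, so check that the header is followed by an empty line (\r\n\r\n) before the image bytes and that the boundary string in each part matches the one declared in the Content-Type header.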

Upgrades
  • Add low-light illumination by switching on the onboard white LED flash (GPIO 4) on motion; this needs the SD card in 1-bit mode, because 4-bit mode uses GPIO 4 as a data line
  • Add a second camera for a two-angle view with a stream selector web page
  • Add basic motion detection by comparing consecutive frames pixel-by-pixel without a PIR
  • Add password protection to the web interface using HTTP Digest Authentication
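For the frame-comparison upgrade listed above, JPEG buffers cannot be compared byte by byte, so the usual approach is to grab small uncompressed frames and count how many pixels changed. The following plain C++ sketch shows the comparison only. It assumes two equally sized 8-bit grayscale buffers (for example frames captured with PIXFORMAT_GRAYSCALE at a low resolution); `framesDiffer` and both thresholds are hypothetical and would need tuning on real footage.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Compares two equally sized 8-bit grayscale frames.
// A pixel counts as changed when it differs by more than pixelThreshold;
// motion is reported when more than changedFraction of all pixels changed.
bool framesDiffer(const uint8_t* previous, const uint8_t* current,
                  size_t pixelCount, int pixelThreshold, double changedFraction) {
    if (pixelCount == 0) return false;
    size_t changed = 0;
    for (size_t i = 0; i < pixelCount; ++i) {
        int delta = std::abs(static_cast<int>(current[i]) - static_cast<int>(previous[i]));
        if (delta > pixelThreshold) ++changed;
    }
    return static_cast<double>(changed) > changedFraction * static_cast<double>(pixelCount);
}
```

A call such as `framesDiffer(prev, cur, width * height, 20, 0.05)` reports motion when more than 5 % of the pixels changed by more than 20 grey levels.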

Overview

The advanced build integrates face detection from the esp-who (ESP-FACE) library into the motion-triggered pipeline. The intended pipeline draws the bounding box on the frame when a face is detected; the sketch below leaves the detection call as a commented placeholder, so every event is reported as motion until a real detector is added under ESP-IDF. All events (motion, face detected) are published to MQTT with JSON metadata. A Node-RED dashboard shows a live event timeline, the last captured frame, and alert statistics. SD recordings are automatically organised into date and hour folders.

Components
  • 1× ESP32-CAM module
  • 1× PIR sensor
  • 1× MicroSD card
  • 1× MQTT broker
  • 1× Node-RED with dashboard
Wiring
Component Pin | ESP32 Pin | Notes
Same as intermediate | PIR on GPIO 13 |
Arduino Code
esp32-security-camera-system_advanced.ino
// ESP32-CAM Security Camera - Advanced (face detection + MQTT + organised SD)
// Face detection uses esp_face_detect from esp-who component
// Install: add esp-who component via ESP-IDF component manager
#include "esp_camera.h"
#include "FS.h"
#include "SD_MMC.h"
#include <WiFi.h>
#include <PubSubClient.h>
#include <ArduinoJson.h>
#include <time.h>

// face detection (esp-who library):
// #include "human_face_detect_msr01.hpp"
// #include "human_face_detect_mnp01.hpp"

WiFiClient wifiClient; PubSubClient mqtt(wifiClient);
const char* SSID="YourSSID", *PASS="YourPass";
const char* MQTT_HOST="192.168.1.100";
const int PIR=13;

void initCamera(){
  camera_config_t c={};
  c.ledc_channel=LEDC_CHANNEL_0; c.ledc_timer=LEDC_TIMER_0;
  c.pin_d0=5;c.pin_d1=18;c.pin_d2=19;c.pin_d3=21;
  c.pin_d4=36;c.pin_d5=39;c.pin_d6=34;c.pin_d7=35;
  c.pin_xclk=0;c.pin_pclk=22;c.pin_vsync=25;c.pin_href=23;
  c.pin_sscb_sda=26;c.pin_sscb_scl=27;c.pin_reset=-1;c.pin_pwdn=32;
  c.xclk_freq_hz=20000000;
  c.pixel_format=PIXFORMAT_JPEG;
  c.frame_size=FRAMESIZE_VGA;
  c.jpeg_quality=10; c.fb_count=2;
  esp_camera_init(&c);
}

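String sdPath(const char* prefix){
  struct tm ti={}; getLocalTime(&ti); // fields stay zero if the clock is not set yet
  char day[16]; snprintf(day,sizeof(day),"/%04d%02d%02d",
    ti.tm_year+1900,ti.tm_mon+1,ti.tm_mday);
  char dir[32]; snprintf(dir,sizeof(dir),"%s/%02dh",day,ti.tm_hour);
  // mkdir() creates one level per call, so make the date folder first
  SD_MMC.mkdir(day);
  SD_MMC.mkdir(dir);
  char path[80]; snprintf(path,sizeof(path),"%s/%s_%02d%02d%02d.jpg",
    dir,prefix,ti.tm_hour,ti.tm_min,ti.tm_sec);
  return String(path);
}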

void publishEvent(const char* type, int faces, const String &path){
  StaticJsonDocument<128> doc;
  doc["event"]=type; doc["faces"]=faces; doc["file"]=path;
  char buf[128]; serializeJson(doc,buf);
  mqtt.publish("cam/event",buf);
}

void handleMotion(){
  camera_fb_t *fb=esp_camera_fb_get();
  if(!fb) return;
  // In esp-who IDF project:
  // HumanFaceDetectMSR01 detector(0.3f, 0.3f, 10, 0.3f);
  // auto results = detector.infer(fb);
  // int faceCount = results.size();
  int faceCount=0; // replace with actual detection result
  const char* eventType=(faceCount>0)?"face_detected":"motion";
  String path=sdPath(eventType);
  File f=SD_MMC.open(path,FILE_WRITE);
  if(f){ f.write(fb->buf,fb->len); f.close(); }
  esp_camera_fb_return(fb);
  publishEvent(eventType,faceCount,path);
  Serial.printf("Event: %s faces: %d file: %s\n",eventType,faceCount,path.c_str());
}

void setup(){
  Serial.begin(115200);
  pinMode(PIR,INPUT);
  SD_MMC.begin("/sdcard",true); initCamera(); // 1-bit SD mode leaves GPIO 13 free for the PIR
  WiFi.begin(SSID,PASS);
  while(WiFi.status()!=WL_CONNECTED) delay(500);
  configTime(0,0,"pool.ntp.org");
  mqtt.setServer(MQTT_HOST,1883);
}

void loop(){
  if(!mqtt.connected()) mqtt.connect("SecurityCam");
  mqtt.loop();
  static unsigned long last=0;
  if(digitalRead(PIR)==HIGH&&millis()-last>10000){
    last=millis();
    handleMotion();
  }
}
How It Works
01

Organised SD Card Folder Structure: Captures are organised into folders by date and hour: /YYYYMMDD/HHh/. SD_MMC.mkdir() creates only one directory level per call and fails if the parent folder is missing, so the date folder must be created before the hour folder. This structure makes it easy to locate footage from a specific time period without scrolling through thousands of files in a single flat directory.
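The path arithmetic can be separated from the SD card calls, which makes it easy to check on a desktop. The following plain C++ sketch builds the three strings involved; `CapturePaths` and `buildCapturePaths` are hypothetical names, and std::string stands in for the Arduino String type.

```cpp
#include <cstdio>
#include <ctime>
#include <string>

struct CapturePaths {
    std::string dayDir;   // e.g. /20250615
    std::string hourDir;  // e.g. /20250615/14h
    std::string file;     // e.g. /20250615/14h/motion_140509.jpg
};

CapturePaths buildCapturePaths(const std::tm& t, const char* prefix) {
    char day[16];
    std::snprintf(day, sizeof(day), "/%04d%02d%02d",
                  t.tm_year + 1900, t.tm_mon + 1, t.tm_mday);
    char hour[8];
    std::snprintf(hour, sizeof(hour), "/%02dh", t.tm_hour);
    char name[16];
    std::snprintf(name, sizeof(name), "_%02d%02d%02d.jpg",
                  t.tm_hour, t.tm_min, t.tm_sec);

    CapturePaths p;
    p.dayDir = day;
    p.hourDir = p.dayDir + hour;
    p.file = p.hourDir + "/" + prefix + name;
    return p;
}
```

On the device, dayDir and hourDir would each be passed to SD_MMC.mkdir() in that order before the file is opened.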

02

ESP-FACE Two-Stage Detection: The esp-who face detection pipeline uses two neural network stages: MSR01 (fast, approximate) filters the full frame for face candidates; MNP01 (accurate, slow) refines each candidate. Running both stages on a VGA frame takes 150-300 ms on the ESP32-CAM, yielding 3-6 detections per second.

03

Event-Driven MQTT Publishing: Each motion or face detection event publishes a JSON object with event type, face count, and captured file path. Node-RED subscribes to cam/event and maintains a live event log. Different event types trigger different automations: motion events log to a timeline; face events trigger Telegram alerts.

04

Node-RED Alert Timeline: A Node-RED function node maintains an array of the last 50 events with timestamps, event types, and face counts. A UI text widget renders this as a scrollable event log. A UI chart widget plots event frequency per hour, highlighting unusual activity periods.

Applications
  • AI-enhanced home security with face detection alerts
  • Access control audit trail with face detection logs
  • Retail customer flow analysis with people counting
  • School entrance monitoring with anonymous presence detection
Troubleshooting

Face detection not available in Arduino IDE

esp-who requires the ESP-IDF build system. Use ESP-IDF with the esp-who component for the advanced build. The Arduino IDE sketch above shows the integration pattern but the face detection calls must be replaced with the actual ESP-IDF C++ API.

SD card organisation creates too many directories

FAT32 allows at most 65,534 entries per directory, and fewer when long filenames are in use. With one date folder per day in the root and at most 24 hour folders inside each, this will not be a practical limit. If scanning directories manually becomes slow, use a flat naming scheme with timestamp prefixes instead.

MQTT event bursts when motion is continuous

The 10-second cooldown (millis()-last>10000) prevents event floods. For busy scenes, increase the cooldown to 30 seconds. Add a minimum face confidence threshold to avoid publishing events for low-confidence detections.
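The confidence threshold can be applied before the event is published. The sketch below is plain C++ and assumes the detector reports one score per face in the range 0.0 to 1.0; `countConfidentFaces` and the example scores are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Counts detections whose score reaches the minimum confidence.
// The scores are hypothetical values in the range 0.0 to 1.0.
int countConfidentFaces(const std::vector<float>& scores, float minConfidence) {
    int count = 0;
    for (size_t i = 0; i < scores.size(); ++i) {
        if (scores[i] >= minConfidence) ++count;
    }
    return count;
}
```

The returned count would replace faceCount in handleMotion(), so an event is labelled face_detected only when at least one detection passes the threshold.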

Upgrades
  • Add face recognition to identify known individuals and only alert on unknown faces
  • Add two-way audio using the ESP32-S3 I2S microphone and speaker for a video doorbell
  • Add cloud backup: upload footage to an S3-compatible storage bucket via HTTP PUT
  • Add a Grafana dashboard for long-term motion frequency heatmap visualisation