The DIY video doorbell with voice response is better than anything you can buy

Among the many ESPHome projects out there, one stands out: the ESPHome video doorbell with voice response. It improves on commercial alternatives such as the Ring doorbell not only in appearance, but also in certain functional areas. With its sleek, user-friendly design and tight Home Assistant integration via ESPHome, it shows what DIY smart home solutions are capable of.


Why a Local-Only ESPHome Video Doorbell Makes Sense

A local-only video doorbell has several advantages over its internet-connected counterparts: footage stays on local storage, the doorbell keeps working when the internet connection is down, there are no subscription fees, you retain complete control over your data, and it remains usable even if an external cloud service is shut down. These features make local-only systems an attractive option for those prioritizing privacy and security in their smart home setups.

Addressing Limitations: Audio and Waterproofing Challenges

Despite its impressive attributes, the ESPHome Video Doorbell does have limitations, notably the absence of bidirectional audio. Presently, the doorbell can play pre-recorded MP3 files using a DFPlayer Mini, but it lacks the capability for two-way communication due to ESPHome's current limitations. There is a pending feature request for this functionality, although it hasn't seen development yet.
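
The pre-recorded playback described above could be exposed to Home Assistant along the following lines. This is only a sketch, not the creator's actual configuration: the pin numbers, the `response_player` ID, and the `play_response` action name are all assumptions made for illustration. Older ESPHome releases spell the `actions`/`action` keys as `services`/`service`.

```yaml
# Fragment for the doorbell node. Pin numbers are placeholders;
# pick pins that are free on your own board.
uart:
  tx_pin: GPIO15   # goes to the DFPlayer Mini's RX pin
  rx_pin: GPIO13   # comes from the DFPlayer Mini's TX pin
  baud_rate: 9600  # serial speed the DFPlayer Mini expects

dfplayer:
  id: response_player  # hypothetical ID

api:
  actions:
    # Hypothetical action name. Home Assistant would list it as
    # esphome.<node_name>_play_response.
    - action: play_response
      variables:
        track: int
      then:
        # Play the requested file number from the microSD card.
        - dfplayer.play:
            file: !lambda 'return track;'
```

Each canned reply would then simply be a numbered MP3 file on the microSD card, selected by passing its number as `track`.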

Another aspect to consider is waterproofing. The original creator didn't address this, as rain exposure wasn't a concern. However, for those in different climates, implementing waterproof measures like O-rings or silicone sealants is advisable.

Image: A self-made video doorbell featuring a camera, speaker, and illuminated button. The chassis of the doorbell is white and black.

Understanding the ESPHome Video Doorbell with Voice Response

This ESPHome project combines four distinct components into a cohesive system. At the heart of everything is, perhaps unexpectedly, an ESP32 with a camera module (frequently referred to as an ESP32-Cam):

  1. ESP32 with Camera Module: Serves as the primary doorbell mechanism.
  2. Wall-Mounted Tablet: This device displays the camera feed and offers options for pre-recorded responses when the doorbell is pressed.
  3. Smartphone Notification: Sends alerts to the user's phone with a screenshot and response options.
  4. Doorbell Chime: Utilizes a LOLIN (formerly WEMOS) D1 mini and a DFPlayer Mini, connected to amplified speakers, and also runs ESPHome.

These components work in tandem, as the creator's demonstration videos show.
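
To make the hand-off between the four components more concrete, here is a minimal sketch of how it might be wired together in Home Assistant. It is not the creator's published automation. Every entity ID, the `notify.mobile_app_my_phone` target, the `esphome.doorbell_play_response` action, the track number, and the `DOORBELL_REPLY_WAIT` identifier are hypothetical and would need to match your own setup. The sketch uses the long-standing `platform:`/`service:` spelling; recent Home Assistant releases also offer `trigger:`/`action:` equivalents.

```yaml
automation:
  # 1. Button pressed: ring the chime, grab a snapshot, notify the phone.
  - alias: Doorbell pressed
    trigger:
      - platform: state
        entity_id: binary_sensor.doorbell_button   # hypothetical entity
        to: "on"
    action:
      - service: button.press
        target:
          entity_id: button.doorbell_chime_ring_chime   # hypothetical entity
      - service: camera.snapshot
        target:
          entity_id: camera.doorbell_camera   # hypothetical entity
        data:
          filename: /config/www/doorbell.jpg
      - service: notify.mobile_app_my_phone   # hypothetical notify target
        data:
          title: Doorbell
          message: Someone is at the door
          data:
            image: /local/doorbell.jpg   # files in /config/www are served under /local
            actions:
              - action: DOORBELL_REPLY_WAIT   # hypothetical identifier
                title: Please wait a moment

  # 2. Response tapped on the phone: play the matching recording at the door.
  - alias: Doorbell response chosen
    trigger:
      - platform: event
        event_type: mobile_app_notification_action
        event_data:
          action: DOORBELL_REPLY_WAIT
    action:
      - service: esphome.doorbell_play_response   # hypothetical ESPHome action
        data:
          track: 2   # hypothetical file number on the microSD card
```

A wall-mounted tablet could offer the same responses as dashboard buttons that call the same action.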

The ESPHome Doorbell: Camera and Speaker Integration

The doorbell unit, the centrepiece of this project, consists of an ESP32 board, a dedicated camera module on a long flex cable, a DFPlayer Mini, a speaker, and an LED-illuminated button. The flex cable enables a 90° connection between the ESP32 board and the camera module, keeping the camera upright. The ESP32's external antenna ensures a strong Wi-Fi signal. The front features a laser-cut lens insert and custom-cut acrylic glass, which could alternatively be 3D printed for a different aesthetic.

The ESP32 is a crucial component due to its capability to support camera modules, a feature that is not available in alternatives like the Raspberry Pi Pico W or the ESP8266. The ESP32 comes equipped with sufficient processing power and memory, which are essential for handling the data-intensive tasks associated with capturing and processing video feeds.
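
As a rough sketch of what the camera and button portion of the doorbell node might look like in ESPHome, consider the fragment below. The camera pin mapping is the one commonly used for AI-Thinker-style ESP32-CAM boards and is an assumption, since the exact board is not named here. The button pin and the entity names are placeholders as well, and key names can differ slightly between ESPHome versions.

```yaml
# Camera: pin mapping assumed from AI-Thinker-style ESP32-CAM boards.
esp32_camera:
  name: Doorbell Camera   # hypothetical entity name
  external_clock:
    pin: GPIO0
    frequency: 20MHz
  i2c_pins:
    sda: GPIO26
    scl: GPIO27
  data_pins: [GPIO5, GPIO18, GPIO19, GPIO21, GPIO36, GPIO39, GPIO34, GPIO35]
  vsync_pin: GPIO25
  href_pin: GPIO23
  pixel_clock_pin: GPIO22
  power_down_pin: GPIO32

# Push button: wired between the pin and ground (placeholder pin).
binary_sensor:
  - platform: gpio
    name: Doorbell Button   # hypothetical entity name
    pin:
      number: GPIO14
      mode:
        input: true
        pullup: true
      inverted: true        # pressed pulls the pin low
    filters:
      - delayed_on: 20ms    # simple debounce
```

In Home Assistant, the camera would show up as a camera entity and the button as a binary sensor that an automation can use as its trigger.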

The Chime: Customizable Audio Alerts

The custom-built chime, powered by another ESPHome node, plays sounds stored on a microSD card in the DFPlayer Mini. This setup, connected to computer speakers via a 3.5 mm jack, is flexible and can be placed out of sight.

The main benefit of using another ESPHome node as a doorbell chime is that you can build as many individual nodes as you need or want and place them throughout your smart home. Once a node is built, a few lines of YAML will have it integrated into the system almost instantly.
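
To give an idea of how little YAML such a chime node needs, here is a minimal sketch for a D1 mini with a DFPlayer Mini. It is an illustration under assumptions rather than the creator's configuration: the node name, pin choices, the `chime_player` ID, the "Ring Chime" button, and the file number are all hypothetical.

```yaml
esphome:
  name: doorbell-chime   # hypothetical node name

esp8266:
  board: d1_mini

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

api:   # native connection to Home Assistant

uart:
  tx_pin: GPIO5    # D1 on the D1 mini, to the DFPlayer Mini's RX (placeholder)
  rx_pin: GPIO4    # D2 on the D1 mini, from the DFPlayer Mini's TX (placeholder)
  baud_rate: 9600

dfplayer:
  id: chime_player   # hypothetical ID

# A button entity that plays the first file on the microSD card when pressed.
button:
  - platform: template
    name: Ring Chime   # hypothetical entity name
    on_press:
      - dfplayer.play:
          file: 1
```

Adding a second chime elsewhere in the house would mostly be a matter of copying this file and changing the node name.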

Replicating the ESPHome Video Doorbell

To recreate this project, some modifications may be necessary: the 3D-printed case is designed for specific door dimensions and will likely need adapting to yours. The creator has generously shared the code, including the Home Assistant Dashboard setup, automations, and scripts.

While the wall-mounted tablet enhances the experience, it's not essential. Alternatives like the Android/iOS companion app or a persistent notification in the dashboard can serve similar purposes.

In summary, this DIY video doorbell with voice response represents a significant leap over commercial offerings. It blends aesthetics, functionality, and customization, offering a unique and sophisticated solution for the smart home enthusiast.

Image: A portrait photo of Liam Alexander Colman, the author, creator, and owner of Home Assistant Guide, wearing a suit.

About Liam Alexander Colman

Liam Alexander Colman is an experienced Home Assistant user who has been utilizing the platform for a variety of projects over an extended period. His journey began with a Raspberry Pi, which quickly grew to three Raspberry Pis and eventually a full-fledged server. Liam's current operating system of choice is Unraid, with Home Assistant comfortably running in a Docker container.
With a deep understanding of the intricacies of Home Assistant, Liam has an impressive setup, consisting of various Zigbee devices, and seamless integrations with existing products such as his Android TV box. For those interested in learning more about Liam's experience with Home Assistant, he shares his insights on how he first started using the platform and his subsequent journey.
