Cartelera Scrap: WordPress Plugin for Validating Theater Schedules in Mexico


I’ve developed a WordPress plugin called Cartelera Scrap, which automates the validation of theater showtimes published on two different platforms: CarteleraDeTeatro.mx and Ticketmaster Mexico.

Here is a demo of the plugin's results from one of the nightly scrapes:

GitHub code: https://github.com/cobianzo/cartelera-scrap/

The goal is to ensure that schedules published on both sites match. The system automatically scrapes shows and compares the extracted information to detect inconsistencies, which can either be displayed or sent via email.
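
To make the comparison idea concrete, here is a minimal sketch in Python. The plugin itself is written in PHP, so this is only an illustration: the function name, the report keys, and the sample showtimes are all hypothetical, and it assumes each site's showtimes have already been parsed into datetimes.

```python
from datetime import datetime


def compare_schedules(cartelera, ticketmaster):
    """Return the showtimes that appear on one site but not on the other."""
    a, b = set(cartelera), set(ticketmaster)
    return {
        "only_in_cartelera": sorted(a - b),
        "only_in_ticketmaster": sorted(b - a),
    }


# Hypothetical showtimes for a single show (not real data).
cartelera = [datetime(2025, 3, 14, 20, 30), datetime(2025, 3, 15, 18, 0)]
ticketmaster = [datetime(2025, 3, 14, 20, 30), datetime(2025, 3, 16, 18, 0)]

report = compare_schedules(cartelera, ticketmaster)
for site, showtimes in report.items():
    for showtime in showtimes:
        print(site, showtime.isoformat())
```

A show is consistent when both lists in the report are empty; anything left over is a date or time that needs a manual look.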

Technologies and Tools Used

  • WordPress + wp-env for an isolated Docker-based development environment.
  • PHP 8.3, using OOP and following WordPress VIP and Alley Interactive coding standards.
  • Composer to manage PHP dependencies and configure phpcs, phpcbf, phpstan, and PHPUnit.
  • Node.js + npm for development scripts and task automation.
  • PHPUnit 9.4 with a watcher for E2E testing of the plugin’s main features.
  • Custom HTML scraping logic in PHP to parse dates and times from free text.
  • WordPress Cron Jobs to automatically process batches of shows.
  • WP-CLI and custom scripts to manage options, cron jobs, and database actions during development.

How the Plugin Works

  1. Initial scraping: the system scrapes carteleradeteatro.mx/todas and saves the list of shows in the options table as a processing queue.
  2. Iteration: it processes each show, scraping data from both platforms and comparing the schedules.
  3. Export: results can be exported as JSON.
  4. Error detection: mismatches in schedules are displayed, showing which dates don’t match.
  5. Automation: two cron jobs (one daily and one recursive) keep the data up to date without manual intervention.
  6. Testing: includes unit and E2E tests validating the full logic: scraping, parsing, and comparing dates/times.
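
The queue-and-batch flow in the steps above can be sketched roughly as follows. This is Python for illustration only, not the plugin's PHP code. It assumes that the "recursive" job simply runs again until the queue is empty, and every name in it (process_batch, run_until_empty, the batch size of 2) is hypothetical.

```python
def process_batch(queue, batch_size, process_show):
    """Process up to batch_size shows and return (results, remaining queue)."""
    batch, remaining = queue[:batch_size], queue[batch_size:]
    results = [process_show(title) for title in batch]
    return results, remaining


def run_until_empty(queue, batch_size, process_show):
    """Stand-in for the recurring job: keep running batches until nothing is left."""
    all_results = []
    runs = 0
    while queue:
        results, queue = process_batch(queue, batch_size, process_show)
        all_results.extend(results)
        runs += 1
    return all_results, runs


# Hypothetical queue, as it might look after the initial scrape.
pending = ["Show A", "Show B", "Show C", "Show D", "Show E"]


def fake_process_show(title):
    # Placeholder for "scrape both sites and compare schedules".
    return {"title": title, "checked": True}


results, runs = run_until_empty(pending, 2, fake_process_show)
print(f"{len(results)} shows processed in {runs} runs")
```

Working in small batches keeps each run short, which matters when every show means two HTTP requests to external sites.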

Modular Project Structure

  • Scraper_Cartelera and Scraper_Ticketmaster for data extraction.
  • Text_Parser to convert free-text dates/times into structured formats.
  • Results_Processor to validate matches and generate reports.
  • Admin settings page with advanced configuration options.
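
As a rough idea of what a parser like Text_Parser has to do, here is a small Python sketch that pulls dates and times out of Spanish free text. It is not the plugin's implementation. The input format ("14 de marzo de 2025, 20:30 hrs") is an assumption about how the sites might word their schedules, and parse_showtimes is a hypothetical name.

```python
import re
from datetime import datetime

# Spanish month names mapped to month numbers.
MONTHS = {
    "enero": 1, "febrero": 2, "marzo": 3, "abril": 4,
    "mayo": 5, "junio": 6, "julio": 7, "agosto": 8,
    "septiembre": 9, "octubre": 10, "noviembre": 11, "diciembre": 12,
}

# Assumed pattern: "<day> de <month> de <year>", some non-digit filler, then "HH:MM".
PATTERN = re.compile(
    r"(\d{1,2})\s+de\s+([a-z]+)\s+de\s+(\d{4})\D+?(\d{1,2}):(\d{2})",
    re.IGNORECASE,
)


def parse_showtimes(text):
    """Return every date/time found in the text as a datetime object."""
    found = []
    for day, month, year, hour, minute in PATTERN.findall(text):
        month_number = MONTHS.get(month.lower())
        if month_number is None:
            continue  # Unknown month name: skip it rather than guess.
        found.append(
            datetime(int(year), month_number, int(day), int(hour), int(minute))
        )
    return found


sample = "Funciones: 14 de marzo de 2025, 20:30 hrs y 15 de marzo de 2025 a las 18:00 hrs."
for showtime in parse_showtimes(sample):
    print(showtime.isoformat())
```

Real listings are messier than this (weekday ranges, "todos los viernes", multiple times per day), which is why the parsing logic deserves its own module and its own tests.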

Next Steps

  • Add Slack or Telegram notifications.
  • Add scheduled synchronization with external systems via API.
  • Build a UI for visualizing detected mismatches, with filtering options.

#WordPress #PHP #WebScraping #Plugins #OpenSource #Theater #Culture #Docker #Testing #PHPUnit #Composer #npm #Alley #WPCLI #Automation #WordPressVIP