WordPress as a REST Proxy for PhantomBuster & Google Apps Script

The Tribify Engine uses webhooks to tie PhantomBuster automation to our Google Apps Script back end... and failed at it, until WordPress came to the rescue!

This is one of those blog posts that will only be useful to a handful of people… but if you need it, you REALLY need it!

The Problem

Behind the scenes, the Tribify Engine uses PhantomBuster to automate activity on social platforms. Each execution passes its data payload to a Google Apps Script (GAS) web application via a webhook configured in the Phantom’s advanced settings, like this:

This Phantom calls a GAS webhook after execution.

In principle, this works extremely well: it’s how we realize the Provider Model for social platform automation, with PhantomBuster as the primary provider. But in practice, this architecture creates two issues that anybody who hooks PhantomBuster into GAS is going to run into. One is just a pain in the butt, and the other is extremely serious.

At the pain-in-the-butt level, every time you deploy a new version of a GAS Web App, you get a new Web App URL. Each Tribify Engine instance runs in its own GAS Web App and includes a number of Phantoms, and the new URL has to be distributed across ALL of them.

This is more than just a scalability issue! Introducing friction into the release process means we naturally release less often. This has a predictable impact on code quality.

But here’s a much larger issue…

All GAS web apps live at the domain script.google.com, but for security reasons all GET and POST responses are redirected to the domain script.googleusercontent.com (see the GAS documentation). Meanwhile, PhantomBuster treats any redirection in the webhook response as an error (see the PB documentation). And after a few of these failures, PhantomBuster’s automatic response is to remove the offending webhook!

If your Phantom runs again before you reinstate the webhook, any data it generated will be lost (ok, technically it’s preserved in a CSV, but your GAS application won’t see it). There is no obvious way to disable this behavior on either the PhantomBuster or the GAS side, leaving you with an impossible game of whack-a-mole that scales right along with your application: the more Phantoms you are running, the bigger your problem is!

The Solution

To solve the problem, we turned to our WordPress website and did the following:

  • We created a new Custom Post Type, called instance, that is uniquely identified by the post slug and has a field called gas_web_app_url. Guess what goes there.
  • We created a new REST endpoint that receives a webhook POST from PhantomBuster, looks up the instance based on a query parameter, and passes the POST on to the appropriate GAS Web App. This required two new blocks of code in our functions.php file: an endpoint definition and a callback function.

The result is a HUGE improvement! Now we only have a single maintenance point for GAS Web App URL updates: instead of updating many Phantoms per instance, we just update the instance record on the website. And while GAS is still redirecting its response, that is happening behind our website’s domain, and PhantomBuster never sees the redirection. So our webhooks never get removed, and the whack-a-mole game is over!

Custom Post Type

There are a lot of ways to create one of these. We like the Pods Framework.

Our CPT uses the post slug to uniquely identify each instance and relate it to its gas_web_app_url. Here’s what this looks like on the back end:

The instance Custom Post Type
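
If you’d rather not add a plugin, a rough equivalent can probably be built with core WordPress functions alone. The sketch below is an assumption on our part, not what the Tribify Engine runs: the example_ function names are hypothetical, and the rest of this post still uses the Pods API for lookups.

```php
<?php
// Sketch only: a core-WordPress alternative to Pods.
// Function names prefixed with example_ are hypothetical.

// Register the "instance" post type and its gas_web_app_url meta field.
// In functions.php, hook it with:
//   add_action('init', 'example_register_instance_cpt');
function example_register_instance_cpt() {
    register_post_type('instance', array(
        'label'    => 'Instances',
        'public'   => false,  // no front-end pages for instances
        'show_ui'  => true,   // but editable in the admin
        'supports' => array('title', 'custom-fields'),
    ));
    register_post_meta('instance', 'gas_web_app_url', array(
        'type'   => 'string',
        'single' => true,
    ));
}

// Look up the GAS Web App URL by post slug. Returns null when the
// instance does not exist or the field is empty.
function example_get_gas_web_app_url($slug) {
    $post = get_page_by_path($slug, OBJECT, 'instance');
    if (!$post) {
        return null;
    }
    $url = get_post_meta($post->ID, 'gas_web_app_url', true);
    return empty($url) ? null : $url;
}
```

Either way, the important design choice is the same: the post slug is the lookup key, and the URL lives in exactly one field.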

Endpoint Definition

We define a custom POST endpoint at /wp-json/engine/post. POSTs to this endpoint require three query parameters. Two of them (source and key) are specific to our use case and don’t need any discussion here.

The third query parameter is instance. It corresponds to the slug of the instance object we defined with the Pods Framework, and is how we look up the corresponding GAS Web App URL. Note the callback validation function on the instance parameter: it validates that an instance object with a slug matching the instance parameter (a) exists and (b) has its gas_web_app_url field populated.

Also note that we designated tribify_engine_post as our callback function for the endpoint. We’ll get into that next, but meanwhile here’s the endpoint definition:

add_action('rest_api_init', function () {
  register_rest_route( 'engine', '/post/', array(
    'methods' => 'POST',
    'callback' => 'tribify_engine_post',
    // PhantomBuster calls this endpoint without a WordPress login, so it stays public.
    // WordPress 5.5+ logs a notice if permission_callback is missing.
    'permission_callback' => '__return_true',
    'args' => array(
      'instance' => array(
        'required' => true,
        'validate_callback' => function($param, $request, $key) {
          // (a) An instance with this slug must exist...
          $instance = pods('instance', $param);
          if (!$instance->exists()) {
            return new WP_Error('instance_invalid', 'Invalid Instance');
          }

          // (b) ...and its gas_web_app_url field must be populated.
          $gas_web_app_url = $instance->field('gas_web_app_url');
          if (empty($gas_web_app_url)) {
            return new WP_Error('gas_web_app_url_undefined', 'Instance Google Apps Script Web App URL undefined');
          }

          return true;
        }
      ),
      'source' => array('required' => true),
      'key' => array('required' => true)
    )
  ));
});
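
To make the three query parameters concrete, here is a minimal sketch of how the webhook URL for a Phantom could be assembled. The domain, slug, source, and key values are all placeholders we made up for illustration, the example_webhook_url helper is hypothetical, and the /wp-json/ prefix assumes pretty permalinks are enabled on the site.

```php
<?php
// Hypothetical helper: build the URL to paste into a Phantom's webhook setting.
function example_webhook_url($site_url, $instance, $source, $key) {
    $query = http_build_query(array(
        'instance' => $instance, // slug of the instance record
        'source'   => $source,   // use-case specific
        'key'      => $key,      // use-case specific
    ));
    return rtrim($site_url, '/') . '/wp-json/engine/post?' . $query;
}

// Placeholder values only.
echo example_webhook_url('https://example.com', 'acme', 'phantombuster', 'my-shared-key'), PHP_EOL;
```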

Callback Function

The callback function composes the GAS Web App URL and passes the POST on to GAS.

First we need to retrieve the gas_web_app_url field of the appropriate instance. Thanks to parameter validation in the endpoint definition, we know this value exists, so we can just go get it.

Besides the body of the POST, our GAS application also expects metadata in query parameters. We used up the instance parameter to look up gas_web_app_url, so we can drop it from the parameter array. Then we use the http_build_query function to rehydrate the query string and tack it onto gas_web_app_url.
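
The drop-and-rehydrate step is easy to try in isolation. This sketch pulls it into a standalone helper; the function name and sample values are hypothetical, and the DEPLOYMENT_ID segment stands in for whatever ID your own Web App URL contains.

```php
<?php
// Hypothetical helper: drop the instance parameter and append whatever is
// left to the GAS Web App URL as a query string.
function example_build_gas_url($gas_web_app_url, array $params) {
    unset($params['instance']);          // already used for the lookup
    $query = http_build_query($params);  // URL-encodes keys and values
    // Skip the "?" entirely if no parameters remain.
    return $query === '' ? $gas_web_app_url : $gas_web_app_url . '?' . $query;
}

// Placeholder values only.
$url = example_build_gas_url(
    'https://script.google.com/macros/s/DEPLOYMENT_ID/exec',
    array('instance' => 'acme', 'source' => 'phantombuster', 'key' => 'my-shared-key')
);
```

Here $url would come out as the exec URL followed by ?source=phantombuster&key=my-shared-key, with instance gone.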

We can then use wp_remote_post to send the raw body of the incoming post straight on to GAS!

As a matter of convenience, we made a couple of simplifying decisions:

  • Even though we can set an arbitrary timeout on the WordPress side, we have no control over how long PhantomBuster will wait for a response without timing out. So we just set blocking to false in the outgoing web request and leave it up to GAS to succeed or fail at processing the POST.
  • PhantomBuster reports an error no matter what kind of response we return, so just to keep things simple we explicitly return a 200. Those error reports generate a lot of emails (which we ignore) but don’t appear to affect processing at all.

Each of these decisions has created some tech debt, but it’s a small price to pay for solving the larger problems! Here’s the callback function:

function tribify_engine_post($request) {
  // Validation in the endpoint definition guarantees this field exists.
  $instance = pods('instance', $request['instance']);
  $gas_web_app_url = $instance->field('gas_web_app_url');

  // Pass every query parameter except instance on to GAS.
  $params = $request->get_query_params();
  unset($params['instance']);

  $post_url = $gas_web_app_url . '?' . http_build_query($params);

  // Fire and forget: with blocking set to false we don't wait for GAS to respond.
  wp_remote_post($post_url, array(
    'blocking' => false,
    'body' => $request->get_body()
  ));

  return new WP_REST_Response('OK', 200);
}
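
The callback above doesn’t inspect what wp_remote_post returns. Even with blocking set to false, WordPress can hand back a WP_Error when it can’t dispatch the request at all (an empty or malformed URL, for example). If you want a breadcrumb in the PHP error log for that case, something like this hypothetical helper might do; it says nothing about whether GAS processed the POST successfully.

```php
<?php
// Hypothetical helper: send the body on to GAS without waiting for a reply,
// and log the failure if WordPress could not dispatch the request.
function example_forward_to_gas($post_url, $body) {
    $response = wp_remote_post($post_url, array(
        'blocking' => false,
        'body'     => $body,
    ));

    if (is_wp_error($response)) {
        error_log('GAS forward failed: ' . $response->get_error_message());
        return false;
    }

    // The request was dispatched; the outcome on the GAS side is unknown.
    return true;
}
```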

Email Filters

Every single one of these executions generates an error email from PhantomBuster. Most of them you don’t care about, because nothing is actually broken… but if PhantomBuster should ever delete one of your webhooks, you definitely want to know!

We use Office 365, so our answer is a Mailbox rule. Two, actually.

The first one looks for emails matching the appropriate pattern (see below) and deletes them, unless it finds a webhook removal notice in the body. This vastly reduces the noise, but the volume of deleted email may still choke your Outlook mail file. So a second rule applies exactly the same logic to permanently delete the emails. It has to be a separate rule because the permanent-delete action can only run on the local client.

Mailbox rules to eliminate the noise.

Just to be clear: the need to create these mailbox rules is a direct consequence of the tech debt generated in the previous section. We get that!

Still a good trade. 😂

The Result

We had to change the webhook URLs in each of our Phantoms one last time… but that’s it! From here on out, whenever we release a new version of a Tribify Engine instance in GAS, we just need to copy the new Web App URL to the instance record in WordPress as described above.

All of our Phantom configurations are now fixed, and thanks to the REST proxy provided by our website, response redirects will no longer cause PhantomBuster to nuke our webhooks.

That’s a HUGE win!

Stable, non-redirected webhook URL.
