The Man in Blue

Using PhantomJS with PHP to screenshot webpages

I’m working on a little project at the moment (stay tuned!) that needs to render dynamic images with text as part of a web service, so I thought I would take a look at running PhantomJS on my server. HTML works so well for text rendering and layout, so generating designs in a browser makes it easy to create arrangements of images and text.

If you’re not familiar with it, PhantomJS is a headless WebKit browser that is scripted with JavaScript and lets you take screenshots of webpages. I was driving it from Node.js using the phantom module. I didn’t want to invest time in setting up a separate web server to run Node.js, so I thought I would explore calling it as a command-line application from PHP, leveraging my server’s existing Apache setup.

After a quick spike on my local machine I was pretty confident that running PhantomJS would work on my proper web server, but when I started implementing it there and trying to access it through a browser, invisible bugs started causing me to pull out my (already sparse) hair.

This is the Node.js code that I was using locally to generate a screenshot of a webpage from the command-line:

var phantom = require('phantom');

phantom.create({parameters: {'web-security': 'no'}}, function(ph) {
    ph.createPage(function(page) {
        var html = 'Aloha!';
        page.setContent(html, 'http://themaninblue.com');
        page.render('image.jpg', {format: 'jpg'}, function () {
            ph.exit();
        });
    });
});

To call that, you can place it in a JS file and pass it to the Node.js interpreter: node index.js

[Image: the rendered “Aloha!” page. Caption: A webpage captured and saved as an image by PhantomJS]

To execute that command-line Node.js app from a web-accessible PHP script, you can use the exec() function. However, calling:

exec('node index.js');

from within my PHP file wasn’t getting me anything: no images, no errors, no output. Just silence.

For a couple of hours I tried a bunch of different ways of writing the Node.js code, trying to figure out why it worked on the command-line but not through a browser: using absolute vs. relative paths in the exec() function; changing file permissions to 777; and even calling node using sudo (bad idea!). None of it worked.

What I eventually figured out was that even though I could tell exec() exactly where the binary lived, when the PHP script gets accessed through a browser it gets run by the www-data user, which has no idea where all of the stuff that PhantomJS depends on is located. When I was running the script locally it could take advantage of my user’s PATH environment variable, but www-data didn’t have that information.
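If you want to see this kind of failure for yourself without going through Apache, one option is to launch a command with a deliberately stripped-down environment. This is only a rough sketch, assuming a Linux or macOS machine: runWithPath() is a helper name I made up for illustration, and /nonexistent-dir is a placeholder directory that is assumed not to exist.

```javascript
const { spawnSync } = require('child_process');
const path = require('path');

// Run `command` with an environment that contains nothing but PATH, roughly
// like a service account that has not inherited your login environment.
// (runWithPath is a hypothetical helper, not part of any library.)
function runWithPath(command, args, pathValue) {
  const result = spawnSync(command, args, {
    env: { PATH: pathValue },
    encoding: 'utf8',
  });
  return {
    // ENOENT means the command could not be found on the given PATH.
    found: !(result.error && result.error.code === 'ENOENT'),
    status: result.status,
    stdout: result.stdout,
    stderr: result.stderr,
  };
}

// Use the running node binary as the test command so the sketch is
// self-contained.
const nodeDir = path.dirname(process.execPath);
const nodeName = path.basename(process.execPath);

// PATH points at a placeholder directory: the command cannot be located.
console.log(runWithPath(nodeName, ['--version'], '/nonexistent-dir'));

// PATH contains the directory that holds the binary: the command runs.
console.log(runWithPath(nodeName, ['--version'], nodeDir));
```

Unlike a bare exec() call, this also hands back stderr and the exit status, which is exactly the information that was missing while everything was failing silently.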

To fix that, you have to supply a PATH to the process that PHP is running in, which means using the putenv() function in your PHP script to define a PATH environment variable before calling exec(). Something like this:

putenv('PATH=/usr/local/sbin:/usr/local/bin');

Bingo! That worked and the PhantomJS code started fully executing. Your PATH might vary, so copy it from your own user’s environment variables on your server.
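Before hard-coding a PATH, it may help to check that the directories you picked really contain the binaries you need. Here is a small sketch of that check; findOnPath() is a hypothetical helper written for this post rather than a built-in function, and the directories and command names below are just the ones from my setup.

```javascript
const fs = require('fs');
const path = require('path');

// Return the first executable file called `command` found in the directories
// listed in `pathValue` (colon-separated on Linux and macOS), or null when
// none of them contains it. (Hypothetical helper for illustration.)
function findOnPath(command, pathValue) {
  for (const dir of pathValue.split(path.delimiter)) {
    if (!dir) continue; // skip empty entries such as a trailing colon
    const candidate = path.join(dir, command);
    try {
      fs.accessSync(candidate, fs.constants.X_OK);
      if (fs.statSync(candidate).isFile()) return candidate;
    } catch (err) {
      // Missing or not executable in this directory; try the next one.
    }
  }
  return null;
}

// The PATH from the putenv() call; yours might differ.
const candidatePath = '/usr/local/sbin:/usr/local/bin';
for (const command of ['node', 'phantomjs']) {
  console.log(command, '->', findOnPath(command, candidatePath));
}
```

Any command that prints null here would presumably fail in the same silent way once PHP runs it with that PATH.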

However, although the code was executing more completely, I was still running into a problem generating the final image from PhantomJS. I tried a bunch of different directory and file permissions and no files were being written by the script, so I ended up refactoring the Node.js code so that it used the renderBase64() function rather than the render() function. This then requires the PHP script to write out the file itself using the base64 string passed from Node.js.

This is what my Node.js file ended up looking like:

var phantom = require('phantom');

phantom.create({parameters: {'web-security': 'no'}}, function(ph) {
    ph.createPage(function(page) {
        var html = 'Aloha!';
        page.setContent(html, 'http://themaninblue.com');
        page.renderBase64('PNG', (content) => {
            process.stdout.write(content);
            ph.exit();
        });
    });
});

And this is the PHP that executes it:

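// Give the web server's process a PATH so that node and PhantomJS can be found.
putenv('PATH=/usr/local/sbin:/usr/local/bin');
// exec() collects each line of the command's output into $base64Image.
exec('node index.js', $base64Image);
$base64Binary = base64_decode($base64Image[0]);
// The Node.js script renders a PNG, so save it with a matching extension.
file_put_contents('image.png', $base64Binary);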
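Because the image format is chosen in the renderBase64() call while the filename is chosen wherever the string gets decoded, the two can drift apart. One way to guard against that is to look at the first few bytes of the decoded data. This is a minimal sketch rather than a full image sniffer: detectImageType() and outputFilename() are names I invented, and it only recognises the PNG and JPEG signatures.

```javascript
// Decode a base64 string (such as the one renderBase64() produces) and guess
// the image format from its leading "magic" bytes.
// Returns 'png', 'jpeg', or null when the data is not recognised.
function detectImageType(base64Text) {
  const bytes = Buffer.from(base64Text.trim(), 'base64');
  const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
  if (bytes.length >= 8 && bytes.subarray(0, 8).equals(pngSignature)) {
    return 'png';
  }
  // JPEG files start with the bytes FF D8 FF.
  if (bytes.length >= 3 && bytes[0] === 0xff && bytes[1] === 0xd8 && bytes[2] === 0xff) {
    return 'jpeg';
  }
  return null;
}

// Build a filename whose extension matches the detected format, falling back
// to ".bin" for unrecognised data.
function outputFilename(baseName, base64Text) {
  const type = detectImageType(base64Text);
  return type ? `${baseName}.${type}` : `${baseName}.bin`;
}

// Example input: just the 8-byte PNG signature, base64-encoded. A real
// screenshot string would be far longer.
const sample = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]).toString('base64');
console.log(outputFilename('image', sample));
```

The same check could be done on the PHP side after base64_decode(); it is written in Node.js here only to keep it next to the rendering code.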

Hopefully this is helpful to anyone else who hits the same problem while trying to use PhantomJS on their Apache server. If not, it’ll be a good reminder to me when I forget all about it while reimplementing one of my own projects.

Cameron Adams is a co-founder and Chief Product Officer at Canva, where he leads the design & product teams and focuses on future product directions & innovative experiences.