Converting a Jekyll blog to Ghost

Some of my colleagues and I have been playing around with Ghost lately. It's a nice blogging platform -- much leaner than WordPress. And though it can't do as much as WordPress, it's also much easier to get started with. The interface is clean, and it's easy to know where to click and what to do when you just want to write. It's not quite as slick as Medium in that regard, but it's pretty close, and it's the closest I've seen yet among open-source software you can host yourself.

Ghost has a great WordPress plugin to export your content from a WordPress site into a Ghost blog. (Though you have to move your images folder manually.) But my site, pushpullfork.com, is on Jekyll, hosted by GitHub! So to play around with Ghost on a fully built-out blog, I decided to spend a good chunk of my morning figuring out how to convert all those Jekyll posts into a Ghost blog.

If you want to try it out, here's the code! If you want to hack your way through it, or get some guidance tweaking that code to work for your blog, keep reading...


One of the great things about going from Jekyll to Ghost is that both are built around Markdown. You write your posts in Markdown, and the platform makes it pretty. But Jekyll also uses a somewhat idiosyncratic metadata structure: YAML front matter. Each blog post is a Markdown file with a YAML header, something like:

---
layout: post
title: "#MacronLeaks - how disinformation spreads"
modified: 2017-05-19 10:54:16 -0400
image:
  feature: disinfo.jpg
  teaser: disinfo-teaser.jpg
share: true
categories: [data science, propagandalytics]
short_description: "It's the combination of catalyst accounts and an army of signal boosters ― a number of which are bots ― that allows disinformation to spread quickly."
---

It's a relatively clean way of declaring your metadata at the top of your Markdown file, but it's not supported by Ghost. Thankfully, Python (my scripting language of choice for projects like this) has a YAML library, PyYAML, for dealing with this kind of data structure, making it relatively easy to convert YAML data into the JSON data structure that Ghost expects. In fact, yaml.load() (or the safer yaml.safe_load()) will take your data and parse it into a Python dictionary (preserving any hierarchy in the original YAML structure); you can then use json.dump() to write the resulting data to a JSON file.
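Here is a minimal sketch of that round trip, assuming PyYAML is installed (pip install pyyaml) and using a trimmed header like the one above. It uses yaml.safe_load() because recent PyYAML versions require an explicit Loader argument for yaml.load(). The default=str argument is an addition for this sketch: it lets json serialize any date values that PyYAML may have turned into datetime objects.

```python
import json

import yaml  # PyYAML, a third-party package

# A trimmed header like the one shown above.
front_matter = """
layout: post
title: "#MacronLeaks - how disinformation spreads"
modified: 2017-05-19 10:54:16 -0400
image:
  feature: disinfo.jpg
  teaser: disinfo-teaser.jpg
share: true
categories: [data science, propagandalytics]
"""

# safe_load builds plain Python objects (dicts, lists, strings, booleans)
# and keeps the nesting of the original YAML.
header = yaml.safe_load(front_matter)

# default=str stringifies anything json cannot serialize natively,
# such as dates that the YAML parser may have converted to datetime objects.
as_json = json.dumps(header, indent=2, default=str)
print(as_json)
```

The nested image block comes through as a nested dictionary, so header['image']['feature'] is available for the image fields later on.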

But it's not quite that simple.

As can be expected, different blogging platforms call things by different names -- Ghost and Jekyll are no exceptions. More frustrating, though, is that not all Jekyll themes use the same names! So if you want to do what I did, you can't just use my script. You'll probably have to make some modifications to make it work for you. But here's what I did:

I started by creating a blank object with all of the same parameters that the Ghost WordPress exporter contains (with a few additional parameters for Facebook's Open Graph and Twitter Cards). Then I assigned the new post_object values from the header of the original Jekyll post:

post_object['title'] = post_object['og_title'] = post_object['twitter_title'] = header['title']
post_object['meta_title'] = header['title']
# Jekyll filenames start with a 'YYYY-MM-DD-' prefix (11 characters); drop it and the extension.
post_object['slug'] = filename[11:].split('.')[0]
# Use the description if the post has one, otherwise leave it blank.
if 'short_description' in header:
    post_object['meta_description'] = post_object['og_description'] = post_object['twitter_description'] = header['short_description']
else:
    post_object['meta_description'] = post_object['og_description'] = post_object['twitter_description'] = ''
# Prefer the modified date, falling back to the original post date.
if 'modified' in header:
    post_object['created_at'] = post_object['updated_at'] = post_object['published_at'] = str(header['modified'])
else:
    post_object['created_at'] = post_object['updated_at'] = post_object['published_at'] = str(header['date'])
post_object['markdown'] = header['content']
# Point the feature image at Ghost's image folder, if the post has one.
if 'image' in header:
    post_object['image'] = post_object['og_image'] = post_object['twitter_image'] = '/content/images/' + header['image']['feature']
else:
    post_object['image'] = post_object['og_image'] = post_object['twitter_image'] = ''

If you use a Jekyll theme close enough to mine, you might get lucky! Otherwise, you'll have to tweak this part of the code to make your Jekyll-to-Ghost conversion work.
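If your theme uses different names, one way to keep the tweaking in one place is a small lookup helper that tries several candidate keys in order. This is only a sketch: first_present and slug_from_filename are hypothetical helpers, the filename is made up, and apart from short_description, modified and date (which come from the post above) the key names are guesses at what other themes might use.

```python
def first_present(header, candidates, default=''):
    """Return the value of the first key in `candidates` that `header` has."""
    for key in candidates:
        if header.get(key) is not None:
            return header[key]
    return default


def slug_from_filename(filename):
    """Drop the 'YYYY-MM-DD-' prefix (11 characters) and the file extension."""
    return filename[11:].split('.')[0]


# Only 'short_description', 'modified' and 'date' come from the post above;
# 'excerpt' and 'description' are hypothetical alternatives.
DESCRIPTION_KEYS = ['short_description', 'excerpt', 'description']
DATE_KEYS = ['modified', 'date']

# A made-up header from a theme that uses 'excerpt' and has no 'modified' date.
header = {'title': 'Hello', 'excerpt': 'A short summary.', 'date': '2017-05-19'}

description = first_present(header, DESCRIPTION_KEYS)
published = str(first_present(header, DATE_KEYS))
slug = slug_from_filename('2017-05-19-macronleaks.md')
```

Adapting to a new theme then means editing the two key lists rather than the assignments themselves.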

Once you get things into shape, this script should give you a single JSON file containing the metadata and content for every post in your Jekyll blog. (Pages coming soon.)
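Writing that single file could look something like the sketch below. The envelope (a meta block plus a data block holding the list of posts) and the version string are assumptions about what Ghost's importer accepts, so compare them against a file exported from your own Ghost install before relying on them. build_ghost_export and write_export are hypothetical names.

```python
import json
import time


def build_ghost_export(posts):
    """Wrap a list of post dictionaries in the (assumed) Ghost import envelope."""
    return {
        'meta': {
            'exported_on': int(time.time() * 1000),  # milliseconds since the epoch
            'version': '1.0.0',  # assumption: adjust to match your Ghost version
        },
        'data': {'posts': posts},
    }


def write_export(posts, path):
    """Write every post into one JSON file at `path`."""
    with open(path, 'w', encoding='utf-8') as f:
        # ensure_ascii=False keeps non-ASCII characters in titles readable.
        json.dump(build_ghost_export(posts), f, indent=2, ensure_ascii=False)


# A made-up post object, just to show the shape.
posts = [{'title': 'Hello', 'slug': 'hello', 'markdown': 'First post.'}]
export = build_ghost_export(posts)
```

Calling write_export(posts, 'ghost-import.json') would then produce the file to upload.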


Images aren't included in that JSON file. Hopefully, all your Jekyll images are in a single folder (most of mine were). Then you can just zip them up, upload the zip file to your Ghost server, and unzip them! That's what I did. However, I did need to change the folder path to accommodate Ghost's directory structure. You can see that in action with:

header['content'] = parsed[2].replace('/assets/images/', '/content/images/')

That replaces the old folder (/assets/images/) with the new path (/content/images/). Depending on your directory structure, you may also need to adjust that code to work for you.
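For context, here is a sketch of how that line might sit in a small pipeline. It assumes parsed comes from splitting the post text on the --- delimiters, which would make parsed[2] the Markdown body; split_front_matter and relocate_images are hypothetical helper names, and the sample post is made up.

```python
def split_front_matter(text):
    """Split a Jekyll post into (yaml_header, markdown_body)."""
    # Expected shape: '---\n<yaml>\n---\n<markdown>'. Limiting the split to 2
    # leaves any later '---' (such as a horizontal rule in the body) untouched.
    parsed = text.split('---', 2)
    if len(parsed) < 3:
        raise ValueError('no YAML front matter found')
    return parsed[1].strip(), parsed[2].lstrip('\n')


def relocate_images(markdown, old='/assets/images/', new='/content/images/'):
    """Rewrite image paths from the Jekyll folder to the Ghost folder."""
    return markdown.replace(old, new)


# A made-up post with one image reference.
post = '---\ntitle: Demo\n---\n![chart](/assets/images/chart.png)\n'
yaml_header, body = split_front_matter(post)
body = relocate_images(body)
```

Keeping the old and new folders as parameters makes it easy to run the same replacement for images that live somewhere else.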

Importing into Ghost

Once your JSON file is successfully created, use Ghost's import tool (in Labs, as of version 1.0) to import your Jekyll export into Ghost. If there are mistakes, use the delete option in Labs to delete your blog database, fix your JSON file (or image directory), and re-import the corrected file.


I'm hoping to get to pages soon. It's the same process, but because I don't have them all in the same folder, it's not quite as automated. In some cases, I also have dedicated images hosted in other folders to support those pages, so I'll need to do more customization there. However, the process is similar: convert the YAML and Markdown to JSON, and adjust image links. The main difference will be assigning "page": 1 (i.e., TRUE) to the static pages. Otherwise, the data is the same.


I'll update later when I've worked on static pages. In the meantime, I'll try to write a few posts here to get the hang of the new platform, and then we'll see if I end up keeping it. Thankfully, going from JSON to YAML isn't any harder than YAML to JSON, so if I scrap Ghost, going back to Jekyll should be easy! I'll keep you posted.

Happy blogging!

Photo by Christoph Bengtsson Lissalde on Unsplash