Use SQL to Extract Posts, Categories, Tags, and Comments from WordPress
This post shows how to extract content (posts, categories, tags, and comments) out of a WordPress database using SQL.
SQL Query to Extract WordPress Posts with Categories and Tags
Here’s a SQL query to get the posts out of WordPress along with the associated categories and tags.
SELECT
p.ID id,
p.post_title,
p.post_author,
p.post_date,
p.post_date_gmt,
p.post_excerpt,
p.post_content,
p.post_status,
p.comment_status,
p.post_name,
p.post_modified,
p.post_modified_gmt,
p.post_content_filtered,
p.post_parent,
p.guid,
p.post_type,
p.post_mime_type,
p.comment_count,
GROUP_CONCAT(DISTINCT c.`name`) as categories,
GROUP_CONCAT(DISTINCT t.`name`) as tags
FROM wp_posts p
LEFT JOIN wp_term_relationships cr
ON (p.`id`=cr.`object_id`)
LEFT JOIN wp_term_taxonomy ct
ON (ct.`term_taxonomy_id`=cr.`term_taxonomy_id`
AND ct.`taxonomy`='category')
LEFT JOIN wp_terms c ON
(ct.`term_id`=c.`term_id`)
LEFT JOIN wp_term_relationships tr
ON (p.`id`=tr.`object_id`)
LEFT JOIN wp_term_taxonomy tt
ON (tt.`term_taxonomy_id`=tr.`term_taxonomy_id`
AND tt.`taxonomy`='post_tag')
LEFT JOIN wp_terms t
ON (tt.`term_id`=t.`term_id`)
-- I wanted to exclude a few specific post types
WHERE post_type
NOT IN (
'revision',
'attachment',
'nav_menu_item',
'oembed_cache',
'postman_sent_mail'
)
GROUP BY p.id;
The above query is partially based on this code.
SQL Query to Extract WordPress Comments
This SQL query will extract comments out of WordPress. I used a Ruby script to dynamically insert the post IDs into the SQL. (I created a single, machine-readable JSON file with all the posts and comments for later processing. The next step for me was to convert each post into a markdown file so that they could be loading into Gatsby, which is the framework that I was migrating to.)
SELECT
comment_ID comment_id,
comment_post_ID comment_post_id,
comment_author,
comment_author_url,
comment_date,
comment_date_gmt,
comment_content,
comment_approved,
comment_agent,
comment_type,
comment_parent,
user_id
FROM wp_comments
WHERE comment_post_id = 123 -- change this to your post ID
AND comment_approved = '1';
Using Ruby and mysql2
When I processed my WordPress database, I used a Ruby gem called mysql2. It’s an easy way to perform SQL queries and process each database row as a Ruby hash.
# This isn't working code -- it's just to illustrate the basic idea
require 'json'
require 'mysql2'
@client = Mysql2::Client.new(
host: 'localhost',
username: 'db_username',
password: 'db_password',
database: 'db_name'
)
results = @client.query("
your SQL goes here
")
results.each do |row|
# each row in the database is now accessible in this loop as `row`
end
content = 'put your final output for the file here'
# Turn the Ruby hash into JSON and write it to a file.
json = JSON.pretty_generate(content)
File.open('content.json', 'w') { |f| f.write json }
WP Gatsby Markdown Exporter
If you’re migrating to Gatsby.js, there’s also a WordPress plugin called wp-gatsby-markdown-exporter that might help some people. It can be run as a CLI on the server. I found that it didn’t do exactly what I wanted, but other people might find it useful, and the source code might be a starting point for customization.
Get Help Migrating Your WordPress Site
If you have questions or tips, leave a comment below.