scriptygoddess

21 May, 2010

The Blogger to WordPress move from hell!

Posted by: Jennifer In: WordPress|WordPress: Lessons Learned

One of the projects I've been working on recently is to move a client's Blogger blog into WordPress. The blog is a few years old, but it has an ENORMOUS amount of comments. Doing the standard import from within WordPress wasn't bringing over all the posts and it barely scratched the surface of bringing over the comments.

I found this post on ArtLung: Migrating an old Blogger Blog To WordPress – which was a HUGE help. The gist of that was you need to download an export file from Blogger, then get this script you run from the terminal (on Mac) which will convert the Blogger xml into a xml that WordPress will understand. (Just to give you an idea what we're talking about – the Blogger XML file I downloaded was over 40MB.)

THEN the fun began. After the conversion to WordPress XML – the file was still about 30MB. This meant that I had to run the import over a dozen times before it could chew through the whole file. When the import times out and you run it again, it recognizes that some posts are already there, so it doesn't import them a second time – but for some reason, it seemed to have trouble with a bunch of comments. Some of the comments DID get imported more than once. As well – the comment counts for each post weren't getting updated. SO! Here's the fix I pieced together for those issues.

First, to remove the duplicate comments – in PHPMyAdmin – I ran the following SQL command: (*DISCLAIMER: make sure you backup your database before attempting any of this!!)

CREATE TABLE temp_table AS SELECT * FROM wp_comments WHERE 1 GROUP BY `comment_post_ID`,`comment_content`,`comment_date`;

What that does is create a duplicate of the wp_comments table – but it's only grabbing the unique entries. If a comment is posted to the same post, has the same content and the same date – we'll just be grabbing one of those and ignoring the rest.

UPDATED TO ADD: One important thing I need to add in here is that you need to set the primary keys and a few indexes for this new table – This is the structure of a normal wp_comments table:

CREATE TABLE `wp_comments` (
`comment_ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`comment_post_ID` bigint(20) unsigned NOT NULL DEFAULT '0',
`comment_author` tinytext NOT NULL,
`comment_author_email` varchar(100) NOT NULL DEFAULT '',
`comment_author_url` varchar(200) NOT NULL DEFAULT '',
`comment_author_IP` varchar(100) NOT NULL DEFAULT '',
`comment_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`comment_date_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`comment_content` text NOT NULL,
`comment_karma` int(11) NOT NULL DEFAULT '0',
`comment_approved` varchar(20) NOT NULL DEFAULT '1',
`comment_agent` varchar(255) NOT NULL DEFAULT '',
`comment_type` varchar(20) NOT NULL DEFAULT '',
`comment_parent` bigint(20) unsigned NOT NULL DEFAULT '0',
`user_id` bigint(20) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`comment_ID`),
KEY `comment_approved` (`comment_approved`),
KEY `comment_post_ID` (`comment_post_ID`),
KEY `comment_approved_date_gmt` (`comment_approved`,`comment_date_gmt`),
KEY `comment_date_gmt` (`comment_date_gmt`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=2 ;

THEN, I ran this SQL command;

DROP TABLE wp_comments;
RENAME TABLE temp_table TO wp_comments;

What that does is delete the wp_comments table, then we rename our new and improved "temp_table" so that it is now the official wp_comments table.

NEXT – we want to update the comment counts. To save having to recount the comments each time – there is a field in the wp_posts table that stores the total number of comments. However, with the way this import worked, those numbers did not get updated so most were saying there were 0 comments on posts that did have comments. To fix that issue I ran the following SQL command: (which I found here on Planet MySQL)

UPDATE wp_posts wpp
LEFT JOIN
(SELECT comment_post_id AS c_post_id, count(*) AS cnt FROM wp_comments
WHERE comment_approved = 1 GROUP BY comment_post_id) wpc
ON wpp.id=wpc.c_post_id
SET wpp.comment_count=wpc.cnt
WHERE wpp.post_type IN ('post', 'page')
AND (wpp.comment_count!=wpc.cnt OR (wpp.comment_count != 0 AND wpc.cnt IS NULL));

Which as far as I can tell gets the comment counts for each post in wp_posts. (I can't tell you more than that because once you start getting into the "LEFT JOIN" stuff in sql, my brain shuts off)

The other piece advice I'll give is if you have a ginormous blog that you have to move like the one I was working with – set the blog up locally using MAMP, do the imports, all the database correcting stuff – then just export the database and import it onto your server. The database file ended up only being 14MB uncompressed – way better than trying to chew through a 30MB import file. Once the database is up on your server, a few more SQL commands (just like if you're moving your blog to a new domain) to change the local version (probably http://localhost:8888 ) to your actual domain name, and you're done!

Another tidbit to add: if you need to redirect blogger formatted url (that may include ".html" in the end) – add this to your htaccess file and it should redirect to the new (non extension ending) URL:

RewriteEngine On
RewriteRule (.*)\.html $1 [R=301,L]

(I tooke off the first line and put the second line under the "RewriteEngine On" that WordPress has in it's htaccess block and it worked great)

7 Responses to "The Blogger to WordPress move from hell!"

1 | rick

May 22nd, 2010 at 10:54 am

Avatar

Could you not simply increase the script timeout limit (http://php.net/manual/en/function.set-time-limit.php) to get the import to run as one job?

2 | Kristina

July 4th, 2010 at 7:05 am

Avatar

AWESOME. I just used your advice, combined with a method from blogbloke.com, to change my .htaccess file to allow me to forward my Blogger articles while still using the better permalink structure on WordPress, slug only, without the dates in the URL. Thank you SO MUCH for publishing your experience with this!

RewriteEngine On
RewriteRule [0-9]{4}/[0-9]{2}/(.*)\.html $1 [R=301,L]

3 | mark

July 23rd, 2011 at 1:39 pm

Avatar

OMG this is exactly what Im looking for. However, Im confused. I created the new temp-table, but Im not sure I understand where you say to update primary keys, etc… Would you mind telling me exactly what I need to do there? Im at step 1 where I've created the new table but I dont know what to do next. Thank you soo much in advance

4 | Amy

September 2nd, 2011 at 11:43 pm

Avatar

I have the same question as Mark about the step where you said to change the primary keys and add some indexes. Can you tell me how to do this?

5 | Jennifer

September 3rd, 2011 at 12:09 am

Avatar

It's been over a year now since I had to do that – but I think the critical part is this:
PRIMARY KEY (`comment_ID`),
KEY `comment_approved` (`comment_approved`),
KEY `comment_post_ID` (`comment_post_ID`),
KEY `comment_approved_date_gmt` (`comment_approved`,`comment_date_gmt`),
KEY `comment_date_gmt` (`comment_date_gmt`)

That indicates which is the primary key and which is an index key – you can go into phpmyadmin and flag those fields (I think in the structure view) as primary key (in the case of the "comment_ID" field, and index key (as in the case of the others…

6 | Amy

September 6th, 2011 at 2:11 am

Avatar

Thank you Jennifer, that helped a lot. I have one more question…is there a command that would change all of a certain comment_post_id's to a different number? So for example, change all the id's that are "15,775" to "3853".

7 | Jennifer

September 6th, 2011 at 10:00 pm

Avatar

Probably something along the lines of this (TEST THIS on a DEV COPY of the database first!!!)

UPDATE `wp_comments` SET `comment_post_ID` = '3853' WHERE `comment_post_ID` = '15775';

Featured Sponsors

Genesis Framework for WordPress

Advertise Here


  • Scott: Just moved changed the site URL as WP's installed in a subfolder. Cookie clearance worked for me. Thanks!
  • Stephen Lareau: Hi great blog thanks. Just thought I would add that it helps to put target = like this:1-800-555-1212 and
  • Cord Blomquist: Jennifer, you may want to check out tp2wp.com, a new service my company just launched that converts TypePad and Movable Type export files into WordPre

About


Advertisements