Note to self: fixing “weird” Unicode characters like ‘Â’ and ‘’’ in a WordPress SQL database

Photo of rusted typewriter keys by Flickr user darkday
Photo “Retired Keyboard” by Flickr user darkday

This post is based in part on a more detailed post from Jeff Starr at Digging Into WordPress. I’m posting it here so I can easily find the recipe I most often need. Definitely visit Jeff’s post for a full explanation.

99.999% of WordPress users are unlikely to need to import from an older archived website database into a new one. This post is for the remaining few (including me!) who run into character mismatches when they do have to do this.

Note that this post is more of a “shorthand” than a detailed explanation. Note also that you can break your website, badly, if:

  • you don’t have at least one full backup plus a database backup and you’ve tested to make sure you can restore your site from the backups you’ve just made, and
  • you already know what you’re doing.

Run these code snippets in your MyPhpAdmin or Adminer database manager.

First this to clean up post titles

UPDATE wp_posts SET post_title = REPLACE(post_title, 'Â', '');
UPDATE wp_posts SET post_title = REPLACE(post_title, '“', '“');
UPDATE wp_posts SET post_title = REPLACE(post_title, '”', '”');
UPDATE wp_posts SET post_title = REPLACE(post_title, '’', '’');
UPDATE wp_posts SET post_title = REPLACE(post_title, '‘', '‘');
UPDATE wp_posts SET post_title = REPLACE(post_title, '—', '–');
UPDATE wp_posts SET post_title = REPLACE(post_title, '–', '—');
UPDATE wp_posts SET post_title = REPLACE(post_title, '•', '-');
UPDATE wp_posts SET post_title = REPLACE(post_title, '…', '...');

Then this to clean up post content

UPDATE wp_posts SET post_content = REPLACE(post_content, 'Â', '');
UPDATE wp_posts SET post_content = REPLACE(post_content, '“', '“');
UPDATE wp_posts SET post_content = REPLACE(post_content, '”', '”');
UPDATE wp_posts SET post_content = REPLACE(post_content, '’', '’');
UPDATE wp_posts SET post_content = REPLACE(post_content, '‘', '‘');
UPDATE wp_posts SET post_content = REPLACE(post_content, '—', '–');
UPDATE wp_posts SET post_content = REPLACE(post_content, '–', '—');
UPDATE wp_posts SET post_content = REPLACE(post_content, '•', '-');
UPDATE wp_posts SET post_content = REPLACE(post_content, '…', '...');

And yes, there are other characters out there. And other columns in other tables you may also need to clean up (comments for instance.) Adjust your scripts accordingly. There are also “more complete” ways to do this that involve editing database, table, and column character codes.

FacebooktwitterredditpinterestlinkedinmailFacebooktwitterredditpinterestlinkedinmail
Posted in , tagged with: #

David Innes, RealBasics.com

I've been building and maintaining websites since 1997 and building and supporting similar hypertext-driven software since 1987. I've done maintenance, support, and maintenance for physical and digital systems since 1981. And no, I still haven't seen it all but by now I usually know where to look. More about David Innes...