How I moved all my content from comeacross.info to raoulpop.com

A place in this world

Background info

As midnight approached this past New Year’s Eve, I was busy working on a long-term project. I was about to move all of my content (every article and post I’d written) from comeacross.info to raoulpop.com. There were many reasons for this, but consolidation was the most readily apparent.

As detailed on my About page, I’d already combined my content from other sites of mine onto comeacross.info, but there was one more piece of the puzzle that needed to fall into place. I’d alluded to it already. I was thinking about doing it in 2006, believe it or not. As a matter of fact, when I sat down and thought about whether to start writing at comeacross.info or raoulpop.com, I knew deep down I should choose to start writing on my personal domain, but worried it might be too difficult for people to remember and type the name.

After a year or so at ComeAcross, I realized that the subjects I was writing about were much too varied for a standalone site. I was writing in a personal voice, using a lot of first person, and it only made sense to have that sort of content reside on my personal site. Plus, there were so many splogs (spam blogs) on the .info TLD that I worried whether I would be taken seriously if I stayed on .info. I’d owned raoulpop.com for a long time, I wasn’t really putting it to good use, and it didn’t make sense not to.

I set a deadline of 12/31, and got to work on planning and research. What better time for such a big change as this than New Year’s, right?

I’m documenting this for you because someone else might need to know how to do it. And I figure the thought process that went on behind the scenes is also worth knowing.

Planning and research

My biggest challenge was to figure out how to redirect all of the traffic from comeacross.info to raoulpop.com, reliably and accurately. I needed to make sure that every one of my articles and posts would redirect to my new domain automatically, so that a URL like

http://comeacross.info/2007/12/30/my-photographic-portfolio/

would automatically change to

http://raoulpop.com/2007/12/30/my-photographic-portfolio/

and the redirect would work in such a way that search engines would be properly notified and I wouldn’t lose my page rank.

I knew about 301 redirects, but I wasn’t sure how to accomplish them in the Linux/WordPress environment the way that I wanted them to work. I had worked mainly with Microsoft web servers until recent times, and Linux was and still is fairly new to me. I was using John Godley’s Redirection plugin for WP (it’s an awesome plugin btw), and I knew it could do 301 redirects quite nicely. I had been using it heavily when I changed post slugs or deleted/consolidated posts at ComeAcross.

I worked out a line of Regex code that I could use to create a site-wide redirection. I tested it and it worked fine. In case you’re wondering, you can easily test it by creating a 307 (temporary) redirection instead of a 301 (permanent) redirection. Here’s how to do it:

Create a new 301 redirection where the source URL is

/(.*)

and the target URL is

http://www.example.com/$1

Make sure you check the Regex box, add it, and you’re done.
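To see what that rule does to an incoming address, here is a minimal Python sketch of the same substitution. It is not the plugin's code, just an illustration: the function name redirect_target is made up, and www.example.com is the same placeholder used above. Python writes the backreference as \1 where the plugin uses $1.

```python
import re

# Same pattern and target as the rule above; Python spells the
# backreference \1 where the Redirection plugin uses $1.
SOURCE_PATTERN = r"/(.*)"
TARGET_TEMPLATE = r"http://www.example.com/\1"


def redirect_target(path):
    """Return the URL a request path would be sent to, or None if no match."""
    match = re.fullmatch(SOURCE_PATTERN, path)
    if match is None:
        return None
    # expand() fills \1 with whatever followed the leading slash
    return match.expand(TARGET_TEMPLATE)


print(redirect_target("/2007/12/30/my-photographic-portfolio/"))
```

Everything after the first slash is captured and appended to the new domain, which is why a single rule can cover every post on the site.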

Just to make sure, I contacted John Godley to confirm whether it was the best way to do things. He said that would certainly do the job, but there was a MUCH easier and faster way to do it, one that saves a lot of the overhead that comes into play when WP gets used. It works through the .htaccess file. He was kind enough to provide me with the code, which is reproduced below.

<IfModule mod_rewrite.c>

RewriteEngine On

RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

</IfModule>

Just paste that into your .htaccess file (remove all other code but make sure you back it up somewhere in case you need it), save it, upload it, and you’re done.

Don’t do anything yet though! Not before you’ve thoroughly backed up everything! Let me outline the steps for you, and keep in mind that I wanted to mirror all of my content across two separate WP sites using the same WP version, and to redirect from the first to the second. These two conditions have to be met in order for my advice to apply to your situation.

  1. Make sure both sites are on the latest and greatest version of WP, or at least they’re on the SAME version of WP
  2. Back up the database from the old domain
  3. Download all site files from the old domain
  4. Upload site files to new domain
  5. Restore database to new domain
  6. Make changes to .htaccess file as shown above
  7. Log into your new domain’s WP admin panel and change the site and blog URLs. Now you’re done! Check to make sure the redirection works properly and all of your content is there.

Upgrade your WP installs

The two sites have to be on the same version, or else things might not work as expected. Upgrade both sites to the latest and greatest, or at least make sure they’re on the SAME version before you do anything else. Go to WordPress, download and install the latest versions. There’s also an Automatic Upgrade plugin, but I haven’t tried it yet, so I can’t vouch for it.

BEFORE you do any sort of upgrade, you need to back up. Yes, you can’t get away from this… You’ll need to do two backups, one before you upgrade, and one after you upgrade, before you transfer the content.

Back up your content

This combines steps 2 and 3 listed above. Backing up your site files is easy. Use an FTP client to access the files on the web server and download them to your hard drive. I always keep a local copy of my site files. It just makes sense.

Backing up your database is a little more involved. Your database contains all of your site content (posts, links, comments, tags, categories, etc.) so you definitely don’t want to lose it. There are detailed instructions on backing up the database on the WordPress site. You can follow those, or you can go to your site’s Admin Panel >> Manage >> Export and download the WordPress WXR file, which you can import into your new site afterwards.

While this is great for backups, restores are another matter. I tried it and found that the import operation kept timing out at my web host. Given that I have thousands of posts, I didn’t want to sit there re-restoring the WXR file only to get a few posts done with every operation. I needed something quicker.

There is a plugin called WordPress Database Backup which lets you download a zipped SQL file of the database. You can use this to restore the database through the MySQL Admin Panel, if your webhost provides you access to it.
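If your web host gives you shell access (an assumption; many shared hosts don't), the same kind of SQL dump can be produced with the mysqldump command-line tool instead of a plugin. The sketch below only builds and runs that command. The database name, user, host and file name are hypothetical placeholders, and this is an alternative route, not what the plugin does internally.

```python
import subprocess


def build_dump_command(db_name, db_user, db_host, out_file):
    """Build a mysqldump command line; all four values are placeholders."""
    return [
        "mysqldump",
        f"--host={db_host}",
        f"--user={db_user}",
        "--password",  # no value given, so mysqldump prompts for it
        f"--result-file={out_file}",
        db_name,
    ]


def dump_database(db_name, db_user, db_host, out_file):
    """Run the dump; raises CalledProcessError if mysqldump fails."""
    command = build_dump_command(db_name, db_user, db_host, out_file)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical names; substitute the values for your own database.
    print(" ".join(build_dump_command("wp_old", "wp_user", "localhost", "backup.sql")))
```

Leaving the password off the command line keeps it out of your shell history, at the cost of having to type it at the prompt.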

What I did was to simply point my new site install to my old database. This is a very handy and easy solution if you plan to host both sites with the same web host. But this still doesn’t excuse you from backing up the DB before you upgrade the WP install! :-)

Restore your content to the new site

This is a two-step process (see #4 and #5 above) and involves reversing the steps you took during the backups. You will now upload your site files to the new domain, and you will restore the database to the new domain as well. If you’re in my situation, where you’re using the same web host, you can simply point the wp-config.php file on your new domain to the old database.
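When both installs are meant to share one database, it is worth confirming that the two wp-config.php files really name the same one. WordPress keeps those settings in define() lines for DB_NAME, DB_USER, DB_PASSWORD and DB_HOST. The following is a rough sketch that pulls them out with a regular expression. The helper names and the sample values are invented for illustration, and the pattern assumes simple quoted values with no embedded quotes.

```python
import re

# Matches lines such as: define('DB_NAME', 'wordpress');
DEFINE_RE = re.compile(
    r"""define\(\s*['"](DB_NAME|DB_USER|DB_PASSWORD|DB_HOST)['"]\s*,\s*['"]([^'"]*)['"]\s*\)"""
)


def read_db_settings(config_text):
    """Extract the database constants from the text of a wp-config.php file."""
    return dict(DEFINE_RE.findall(config_text))


def same_database(old_config_text, new_config_text):
    """True when both configs name the same database on the same host."""
    old = read_db_settings(old_config_text)
    new = read_db_settings(new_config_text)
    keys = ("DB_NAME", "DB_HOST")
    return all(k in old and old.get(k) == new.get(k) for k in keys)


# Hypothetical config fragment, not taken from a real site
sample = "define('DB_NAME', 'wp_old');\ndefine('DB_HOST', 'localhost');\n"
print(read_db_settings(sample))
```

To use it on real files, read each wp-config.php into a string and pass both strings to same_database.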

Make sure all your content is properly restored before going on to the next step!

Make changes to the .htaccess file

You will need to make sure you don’t touch the .htaccess file before you transfer it to your new domain. Only the .htaccess file on your old domain needs to change. Remember this, or you’ll be wondering what’s going on with the redirects afterwards…

Use the code I’ve given you above, in the Planning and Research section, to make changes to the .htaccess file on your old domain, after you’ve made absolutely sure that all of your content is now mirrored on the new domain. Once this is done, the redirects will occur automatically and seamlessly.

Final checks and tweaks

This is very important. Surf to your old URL. You should get re-directed to your new URL. Do a search in the search engines for content of yours that you know is easily found. Click on the search results and make sure the links get redirected to your new site. Because you’re using 301 redirects, the search engines will automatically change their search results to reflect the URL changes without affecting your page rank, so you shouldn’t lose any search engine traffic if you execute the content move correctly.
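Clicking through by hand works, but a browser follows redirects silently and won't tell you whether the server answered with a 301 or some other code. Here is a minimal sketch of a check that looks at the raw response instead. It assumes plain http URLs like the ones in this post, and the function names are made up for illustration.

```python
import http.client
from urllib.parse import urlsplit, urlunsplit


def expected_target(old_url, new_host):
    """Swap only the host, keeping path and query, as the redirect should."""
    parts = urlsplit(old_url)
    return urlunsplit((parts.scheme, new_host, parts.path, parts.query, ""))


def check_redirect(old_url, new_host, timeout=10):
    """Request old_url without following redirects.

    Returns True only for a 301 whose Location header is the same
    path on new_host.
    """
    parts = urlsplit(old_url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        location = response.getheader("Location")
        return response.status == 301 and location == expected_target(old_url, new_host)
    finally:
        conn.close()


# Example call (needs network access, so it is left commented out):
# check_redirect("http://comeacross.info/2007/12/30/my-photographic-portfolio/", "raoulpop.com")
```

Running it over a handful of old post URLs is a quick way to catch a rule that sends everything to the home page instead of the matching post.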

There are a few more things you’ll need to check:

If you’d like to make changes to your site feed (and I did), you’ll need to handle that properly. I use FeedBurner, and there are people that subscribe to my content via RSS or via email. I needed to transfer both groups of subscribers to my new feed seamlessly. The FeedBurner folks helped me do just that, and I didn’t lose a single subscriber during the move. I detailed that process in this post.

What about internal links? If you’ve blogged for a while, you’ll have linked to older posts of yours. Those link URLs now contain the old domain, and you’ll need to change all of them at some point, or you’ll risk making those links invalid if you should ever stop renewing your old domain. Fortunately, there’s a Search and Replace plugin for WP that lets you do just that. It works directly with the database, it’s very powerful, and it’s very fast. That means you have to be VERY careful when you use it, because there’s no undo button. You can easily mess up all of your content if you don’t know what you’re doing.

What I did was to replace all instances of “.comeacross.info/” with “.raoulpop.com/”. That did the trick nicely. I then did a regular site search for all instances of ComeAcross and manually made any needed changes to those posts. (Here’s a thought: back up the DB before you start replacing anything. This way you can restore if something should go wrong.)
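Since there's no undo, it helps to count the matches first and only then replace. The sketch below shows that two-step idea on a plain list of post bodies. It doesn't touch a database and isn't how the plugin works. The search strings are assumptions chosen to match the example post URL near the top of this article.

```python
OLD = "comeacross.info/"  # assumed search string (old domain)
NEW = "raoulpop.com/"     # assumed replacement (new domain)


def count_matches(posts, old=OLD):
    """Dry run: how many times the old string appears across all posts."""
    return sum(body.count(old) for body in posts)


def replace_domain(posts, old=OLD, new=NEW):
    """Return new post bodies with every occurrence of old swapped for new."""
    return [body.replace(old, new) for body in posts]


posts = [
    'See <a href="http://comeacross.info/2007/12/30/my-photographic-portfolio/">my portfolio</a>.',
    "No links here.",
]
print(count_matches(posts))
print(replace_domain(posts)[0])
```

One caveat: WordPress also stores some settings as serialized PHP data, which records string lengths, so a raw replace in those fields can corrupt them when the old and new strings differ in length. Restricting the replace to post content avoids that.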

Finally, if you’re using the Google Sitemaps Generator plugin, you’ll want to make sure you manually rebuild your site map. You don’t want to have your old site information in the site map as Google and the other search engines start to crawl your new domain.

That’s about all I did for the site content transfer. It occupied half my New Year’s Eve night, but it was worth it. It’s quite a bit of work, but if you plan it out, it should only take you 4-5 hours or less to execute the transfer, depending on your familiarity with this sort of thing, and the speed of your internet connection (keep in mind that upload speeds are a LOT slower than download speeds on most broadband connections).

Given how much work is involved, I was a bit surprised to see Matthew Mullenweg (founding developer of WordPress) talk about doing his own switch to a new domain in “2 seconds”. I think what he referred to is the changes to the .htaccess file and the blog URLs, which are the fastest parts of the process. There is, however, quite a bit of work that needs to take place behind the scenes before those switches can get flipped. And I also believe (someone correct me if I’m wrong) that he pointed both domains to the same web files, in other words re-used his existing WP install, so he bypassed a lot of the steps that are otherwise required.

Hope this proves helpful to someone!

But what happens if you die?

Blood on the tracks

This is a bit of a rant, but a recent comment on one of my articles reminded me of an argument I sometimes hear as a consultant. It goes something like this: “But what happens if you die?” I cringe when I hear it, not because I can’t counter it, but because I find it silly.

Actually, it’s not really an argument or a question at all. It’s a symptom. It tells me that the person making it is feeling very insecure about the deal.

Here’s what I told a recent potential client when I was asked that question:

I understand the “drop dead” factor, and it’s something that my long-term clients and I talked about. The thing is, unless I drop dead while the project is in development, you’re fairly safe. Once the project is completed, another knowledgeable designer/developer can come in and pick up where I’ve left off. Even while the project is being developed, if I can’t continue for whatever reason, the work isn’t lost. It isn’t as if I write my code in some language that no one understands. A good coder should be able to understand what I’ve done and build on it.

And that’s the truth. I can’t see how that argument could possibly stand on its own feet. If you’re a good developer, are in communication with the client, you back up your work, and you have certain deliverables and a timeline tied to a project, how can the project just disappear if you should kick the bucket? Makes no sense to me. Even if I should die, my computer will still be there. My wife or my friends will be there. My source code should be there. Besides, if it’s a website, chances are I’m working on a server somewhere as well, not just in my home, so the files can be retrieved even if my computer were to crash or be locked down.

Isn’t it individuals that have driven innovation throughout the ages? It’s people doing the work and driving toward goals, people that could croak at any point, I suppose, not machines. If the same “what if” argument had been applied to them, where would we be today? If a company looking to hire someone stops to think, what if he or she dies tomorrow, where will they be? If you find a good product or a good man, do you wait a few years to see whether or not that product will disappear or that person will croak? You have to take some risk if you want to see results, and sometimes the opportunities are there only for short amounts of time.

Better video

I’ve wanted to be able to post the videos I upload to Vimeo on my blog for some time, but the WP video plugins just hadn’t caught up. I’m glad to say that I found one tonight. It’s called, appropriately enough, WordPress Video Plugin. It’ll work just great for most people, so I encourage you to try it out.

I wanted to take advantage of the full width of my blog’s content column, so I modified the Vimeo code to make sure that my videos get sized to a width of 550 pixels and also stay centered.

I’m happy to say that I really like the results. You can see the modified plugin in action on these three posts:

Since I record my videos at a resolution of 640×480 pixels, it’s only natural that I display them at the maximum width possible on my site, right?
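For anyone resizing their own embeds, the height has to shrink along with the width to keep the picture from being stretched. A small worked sketch follows. The function name is made up, and the plugin's actual sizing code isn't shown in this post.

```python
def scaled_height(src_width, src_height, target_width):
    """Height that preserves the source aspect ratio at target_width."""
    # Integer division rounds down to whole pixels
    return target_width * src_height // src_width


# 640x480 is a 4:3 picture; at 550 wide the exact height is 412.5
print(scaled_height(640, 480, 550))
```

So a 640×480 recording shown in a 550-pixel column comes out at about 412 pixels tall.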

Catching a code injection hacker in the act

Several days ago, I installed the Redirection plugin from Urban Giraffe. It’s truly awesome, in more ways than one. John Godley, you are an amazing programmer! As I re-arranged the categories on my blog, I tracked the 404 errors through the plugin. On Saturday morning, I noticed the following bit of information in my log:

You can click on the thumbnail to view the screenshot at full size. Look at the entries for IP address 65.90.251.169. Notice something peculiar? That’s a hacker trying to inject malicious code into my pages. He was trying to call code contained in a text file by the name ide.txt located on a possibly compromised domain.

First, I checked out his domain, new-fields.com. It looked legitimate. The text file was another story altogether. Have a look at the screenshots above. I also saved the code to my computer in case it ends up disappearing from the hacker’s website.

I tested the code, and it looks like some pages from the podPress plugin are targeted or affected, or at least that’s what the error message given by WP referenced when I ran the code. I had that plugin enabled at the time, and I’ve disabled it since. It seems that the code tries to modify one of the header.php pages, along with checking disk space (?). So I thought, let me find out who this hacker is. Apparently, he’s from Naperville, IL, US, or at least that’s where his IP address lives.

What’s more, I thought it’d be interesting to see who owns that domain name where his text file resides. It turns out to be one Samir Farajallah from Dubai.

So what we’ve got so far is some dude in Dubai who owns the domain where the malicious code resides, and some hacker in Naperville, IL, trying to exploit my blog using that malicious code.

Wait, it gets better… On Saturday evening, I have another look at my blog’s 404 log, and I find that some other hacker from Vietnam (IP address: 203.171.31.19) is trying to hack into my blog using that exact same code, but this time the text file’s located on some domain in Argentina. That last link leads directly to the text file with the malicious code, but it’s harmless if you browse it. It only works if you run it as PHP code, like these hackers are trying to do.

So far, it looks like I’ve got two hackers, who may or may not be working together, using the same malicious code, located on two different, possibly compromised domains, and trying to modify my header files, possibly to insert code in there that will display splog content or some other stuff.
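Attempts like these tend to share one visible trait in a 404 log: a query parameter whose value is itself the address of a file on another server. The exact requests are only in the screenshot, so the sketch below is a guess at the general shape and not a reconstruction of what was logged. The function name, the parameter name "page" and the host example.org are all hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl

REMOTE_PREFIXES = ("http://", "https://", "ftp://")


def looks_like_remote_include(request_url):
    """True if any query parameter's value is itself a remote URL."""
    query = urlsplit(request_url).query
    for _name, value in parse_qsl(query, keep_blank_values=True):
        if value.lower().startswith(REMOTE_PREFIXES):
            return True
    return False


# Hypothetical log entries, not copied from the real log
print(looks_like_remote_include("/index.php?page=http://example.org/ide.txt?"))
print(looks_like_remote_include("/2007/12/30/my-photographic-portfolio/"))
```

A filter like this is crude and will miss encoded or obfuscated variants, but it is enough to pull the obvious attempts out of a long list of ordinary 404s.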

Update: It looks like three more hackers are trying their luck today, on Sunday morning, 9/30/07. Their IP addresses are 65.98.14.194, 66.79.165.19 and 66.11.231.48.

What I can tell you is that they haven’t been successful. I checked all of my files, and none of them have been touched. Everything’s fine. At this point, I’m not going to waste any more of my time trying to hunt them down. If I see that the attacks continue, I’ll notify my web hosting provider, along with the hosting providers of the other domains, and I’ll also notify the ISPs who own the IP addresses used in the attacks.

My thanks go out to John Godley for the wonderful Redirection plugin. I wouldn’t have been able to catch these hackers without it. I don’t often check my 404 log files, although I should.

I’ve been working in IT for 13 years or so. Maybe I’m naive, maybe I’m too honest for my own good, but I’ve stayed away from this hacking business, and I’ll continue to do so. It’s just not a sustainable lifestyle. I believe that the bad stuff you do in life will catch up with you sooner or later. It’s inevitable. These hackers will get what’s coming to them, and I won’t even have to lift a finger beyond what I’ve done so far.

Flickr tightens up image security

Given my concern with image theft, I do not like to hear about Flickr hacks. A while back, a Flickr hack circulated that allowed people to view an image’s full size even if the photographer didn’t allow it (provided the image was uploaded at high resolution). The hack was based on Flickr’s standard URL structure for both pages and image file names, and allowed people to get at the original sizes in two ways. It was so easy to use, and the security hole was so big, that I was shocked Flickr didn’t take care of it as soon as the hack started to make the rounds.

It’s been a few months now, and I’m glad to say the hack no longer works. I’m not sure exactly when they fixed it. Since it’s no longer functional, I might as well tell you how it worked, and how they fixed it.

First, let’s look at a page’s URL structure. Take this photo of mine (reproduced above). The URL for the Medium size (the same size that gets displayed on the photo page) is:

http://flickr.com/photo_zoom.gne?id=511744735&size=m

Notice the last URL parameter: size=m. The URL for the Original size is the same, except for that last parameter, which changes to size=o. That makes the URL for the original photo size:

http://flickr.com/photo_zoom.gne?id=511744735&size=o

Thankfully, that no longer works. If the photographer disallows the availability of sizes larger than Medium (500px wide), then you get an error that says something like “This page is private…”

Second, they’ve randomized the actual file names. So although that image of mine is number 511744735, and it stands to reason that I would be able to access the file by typing in something like http://farm1.static.flickr.com/231/511744735_o.jpg, that’s just not the case. Each file name is made up of that sequential number, plus a random component made up of letters and numbers, plus the size indicator. So the actual path to the medium size of the image file is:

http://farm1.static.flickr.com/231/511744735_b873d33b12_m.jpg

This may lead you to think that if you can get that random component from the URLs of the smaller sizes, you can then apply the same URL structure to get at the larger size, but this is also not the case. It turns out that Flickr randomizes that middle part again for the original size. So although it stays the same for all sizes up to 1024×768, it’s different for the original. For example, the URL for the original size of that same photo is:

http://farm1.static.flickr.com/231/511744735_d3eb0edf2d_o.jpg

This means that even if you go to the trouble of getting the file name for one of the smaller sizes, you cannot guess the file name of the original photo, and this is great news for photographers worried about image theft.
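The file-name scheme is easy to see when the pieces are put together in code. This sketch just assembles the URLs quoted above from their parts. The parameter names (farm, server, secret) are labels I'm using for illustration, not anything confirmed by Flickr in this post.

```python
def flickr_image_url(farm, server, photo_id, secret, size):
    """Assemble a static image URL from the parts described above."""
    return f"http://farm{farm}.static.flickr.com/{server}/{photo_id}_{secret}_{size}.jpg"


medium = flickr_image_url(1, 231, 511744735, "b873d33b12", "m")

# Reusing the medium-size random part for the original size
# produces a URL that does not match the real one:
guessed_original = flickr_image_url(1, 231, 511744735, "b873d33b12", "o")
actual_original = flickr_image_url(1, 231, 511744735, "d3eb0edf2d", "o")

print(medium)
print(guessed_original == actual_original)
```

The only difference between the guess and the real thing is the random middle part, and that is exactly the piece that can't be learned from the smaller sizes.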

While I’m writing about this, let me not forget about spaceball.gif, the transparent GIF file that gets placed over an image to discourage downloads. It can be circumvented by going to View >> Source and looking at the code to find the URL for the medium-size image file. It’s painful, but it can be done, and I understand there are some scripts that do it automatically. The cool thing is that after Flickr randomized the file names, it became next to impossible to guess the URL for a file’s original size. The best image size that someone can get is 1024×768, which might be enough for a 4×6 print, and can probably be blown up with special apps to a larger size, but still, it’s not the original.

Perhaps it would be even better to randomize the file name for the large size as well, so that it’s different from the smaller sizes and the original size. That would definitely take care of the problem. Still, this is a big step in the right direction.

The best WordPress plugins

On Christmas day, I gave thanks for the four top technology solutions that impacted my life in 2006. First on my list was WordPress, which is, as far as I’m concerned, the best personal publishing platform. The beauty of WordPress is that it’s almost infinitely extensible through plugins. If there’s a feature that’s not standard in WordPress, chances are someone’s written a plugin that fills that need. Here are the plugins I found most useful during this past year, listed in alphabetical order:

  • Akismet: hands down, the best anti-spam plugin there is. I wish Akismet made junk email filters as well. When coupled with manual approval of new comments, virtually nothing gets by it. It filters out 99.9% of the spam comments, and leaves a few in the moderation queue for me to check. Since I started ComeAcross in April of 2006, it’s filtered out almost 17,000 spam comments. None of those reached the blog. Only meaningful comments written by real people made it to the live site — and that’s a beautiful thing.
  • FeedBurner Feed Replacement: it allows me to present the FeedBurner feed for ComeAcross as the standard feed that gets shown to browsers and feed readers when they visit my site. That means people don’t subscribe to the real site feed, which might change, but to the FeedBurner feed, which stays the same and is enriched with all sorts of goodies.
  • Filosofo Home-Page Control: lets me set a particular page as the home page, and also to separate the blog to a subdirectory, even if it’s at the root level. At first, you don’t get the point, until you realize you can use WordPress to run a regular site by creating pages, then add a blog to a subdirectory later and specify that subdirectory through this plugin. In other words, you run the site pages at http://www.example.com and the blog at http://www.example.com/blog. Really, really nice.
  • Google Sitemaps: I get my lion’s share of traffic from Google, and I’m truly grateful for that. My content gets ranked toward the top in Google search results on many topics. It goes to show that quality content will make it to the top no matter if it’s produced by one person or many people. So anything that will tell the good folks at Google when I publish or change my content is at the top of my list. Imagine my joy when I found that someone put together a beauty of a plugin for WordPress that creates a Google Sitemap of all my site content and pings Google whenever I add or change that content! I was ecstatic, and I still am!
  • inlineRSS: this little plugin allows inline display of RSS feeds from virtually any source. I actually used it to display my del.icio.us bookmarks for a while, but it works with YouTube feeds as well. It won’t display photos or videos (although if you’re brave, you can tweak the XSLT file for those purposes), but if you’re just looking for a simple list of links to your latest and greatest feed items, it’ll do the trick just fine.
  • No Ping Wait: boy, oh boy did I need this plugin after I started using WordPress seriously! Because I set up WordPress to ping several services when I published a new post, the publishing process became unmanageably slow. Any hiccups in reaching a service would cause a delay when saving a post, and possibly bring everything to a stop. Well, that was no way to run a site! With this plugin, pinging is delegated to a separate process and whether it fails or not, it doesn’t affect the publishing of content. After a quick and painless install, my site ran smoothly once more, and I was grateful for it!
  • WordPress Database Backup: talk about a lifesaver! Yes, this plugin, along with Akismet, ships packaged in with WordPress, but even if it didn’t, I’d download it and install it in a heartbeat! It backs up the site database (where all of the posts, pages and comments are stored), compresses it, and either puts it in a backup directory on the server, lets you download it to your computer, or emails it to you! How cool is that! That means you and I can do periodic backups of the site, and restore from them in case anything should happen. I absolutely love it!
  • WP-Contact Form: this plugin lets you easily add a contact form to your WordPress site. Just create a contact page, paste the snippet of code that calls up the contact form code, and you’re set: you get instant functionality that your site visitors will love!
  • WPVideo: I’ve saved the best for last! This is, hands down, no contest, the easiest video plugin for WordPress! It should be packaged together with WordPress and shipped out as a standard config, that’s how easy it is to use! After it’s installed, you just tag any YouTube, Google Video or MetaCafe video link with a simple snippet, and that’s all you need do! The video automatically displays, and you can configure the display of additional data such as video title, duration, and even a download link. I’ve seen some video plugins for WordPress require you to paste special codes from the video URLs, or to use arcane tags and make ridiculous changes to core WordPress template files, but not WPVideo! No, this is the easiest video plugin for WordPress, I guarantee it!

Well, there you have it, folks! If you use WordPress and you don’t already use these plugins, by all means, give them a try, they’ll make your life a whole lot easier! As for me, by way of this post, I’m sending a big, hearty Thank You and Happy New Year to the developers who worked on these plugins. May you wonderful people have a blessed year ahead! I can’t thank you enough for your work! Oh, and if I haven’t mentioned other WordPress plugins that are doing wonders for people, I apologize. Here’s a list of all of the WP plugins. Take your pick!

Update on Microsoft Expression Web Designer

It appears that Web Designer is part of a suite of apps that has yet to launch, called Microsoft Expression. It will contain three apps: Graphic Designer, Interactive Designer and Web Designer. Graphic Designer will be a marriage (in MS fashion) of Fireworks, Illustrator and Photoshop (we’ll see how well that comes out), Interactive Designer will be a UI design/desktop app tool (it integrates seamlessly with Visual Studio), and Web Designer will of course go after Dreamweaver, as detailed before, emphasizing the MS coding platforms (ASP, ASP.NET).

Microsoft Expression Graphic Designer | Microsoft Expression Interactive Designer | Microsoft Expression Web Designer

Graphic Designer and Interactive Designer are still in community edition (read flaky), and it looks like Interactive Designer will only work with .NET Framework 3.0 plus Visual Studio Express (at least). Web Designer is out in Beta and ready for download and use.

I have to ponder MS’ reach on this. They’re clearly building upon their strengths and going after their competitors, which is what they’ve always done, but to go after Photoshop and Dreamweaver is pretty lofty. Only time (and users) will tell whether they’ve managed to reach the target, or, in usual MS fashion, delivered something half-baked. Now we begin to see where all that R&D money went — it didn’t just go to Vista, it also went to stuff like this.