Exporting posts from posterous

Posted   on Friday, September 17 - 2010 (594 words)

Posterous is awesome for great looking blogs without the hassle. It was great to have it for posting my thoughts when my old blog was all messed up. Now that I’ve redesigned it I want to host all my old posts on my new blog, too. Therefore I needed to export them somehow. It’s a bit strange that the guys from posterous don’t offer an export button on the configuration page but they help you out with a nice RESTful API that can be used with just about any language.

Here is how a response to a query looks like:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
<post> 
  <url>http://post.ly/NWNC</url> 
  <link>http://matthiasendler.posterous.com/.../link> 
  <title>Overkill: Java as a first programming language</title> 
  <id>11579710</id> 
  <body> 
    <![CDATA[><p>I recently...]]>
  </body> 
  <date>Fri, 12 Feb 2010 08:12:00 -0800</date> 
  <views>505</views> 
  <private>false</private> 
  <author>Matthias Endler</author> 
  <authorpic>http://files.posterous.com/.../unix_thumb.jpg</authorpic> 
  <commentsenabled>true</commentsenabled> 
  <commentsCount>0</commentsCount> 
</post>

Grab your posts

So off we go with this little PHP snippet that gives you ten of my latest blog posts:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
<?php

  // Posterous PHP export. Requires PHP 5.1.0 or newer
  // Documentation on http://posterous.com/api/reading

  // Set timezone for posterous blog
  date_default_timezone_set('America/Los_Angeles');

  // Query options
  $site_id = "";                // Either use id of the site...
  $hostname = "matthiasendler"; // ...or a posterous subdomain.
  $num_posts = 10;                  // Number of posts to read.
  $page = 1;                              // Get specific page.
  $tag = "";                     // Grab posts with these tags.

  // Create query string
  if ($site_id) $query = "site_id="  . $site_id;
  else          $query = "hostname=" . $hostname;

  if ($num_posts) $query .= "&num_posts=" . $num_posts;
  if ($page)      $query .= "&page="      . $page;
  if ($tag)       $query .= "&tag="       . $tag;

  // Request
  $url = "http://posterous.com/api/readposts?" . $query;
  $xml = simplexml_load_file($url);
  $post_count = count($xml->post);

  // Cumbersome way to walk post entries. 
  // See php.net/manual/de/function.simplexml-load-file.php#86471
  for($i = 0; $i < $post_count; $i++) { 
    $post = $xml->post[$i]; 
    // Now we can use $post->title, $post->date, $post->body...
    write_markdown($post, "post");
  }

?>

Put it into jekyll

I wanted to import the posts into jekyll so I used the function write_markdown() to do so.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
<?php

/**
 * Formats a posterous post as markdown to use it with jekyll
 * array  $post   The post that gets processed
 * string $layout The liquid template to use 
 */
 function write_markdown($post, $layout) {
	// Extract relevant data from $post
	$date  = date("Y-m-d", strtotime($post->date));
	
	// Create filename
	$extension = ".mkdn";
	$filename = $date . "-" . strtolower($post->title);
	
	// Remove whitespace and special characters from filename
	$patterns = array('/\s/', '/:/','/\(/','/\)/', '/]/', 
	'/\[/', '/!/', '/\?/', '/\+/', '/=/','/\./', '/\"/', '/,/');
	$filename = preg_replace($patterns, '-', $filename);
	
	// Replace multiple dashes with single dash
	$filename = preg_replace('/--+/', '-', $filename);
	
	// Remove possible dash at the end of filename
	$filename = trim($filename, '-');
	
	// We're good to go. Create a new markdown file
	$jekyll_post = fopen($filename . $extension, 'w');
	if ($jekyll_post) {
		// Jekyll requires a yaml front matter
		$yaml = "---\n" .
			"layout: " . $layout       . "\n" .
			"title:  " . $post->title  . "\n" .
			"---\n";
				
		// Write front matter, post data and close file
		fwrite($jekyll_post, $yaml);
		fwrite($jekyll_post, $post->body);
		fclose($jekyll_post);
	}
 }

?>

It’s generally a good idea to use an HTML to Markdown converter for the post body. There is a nice one for PHP called Markdownify and a python script named html2text.

Download

You can download the whole script here. Simply execute it by typing php grab.php.

Download the package.

Errata

Thanks to Algert Sula for pointing out an error in the above script.

More? Here's a list of all posts.
Feel free to copy, share and comment.

Tweet