Mastodon Archive To VimWiki

It’s about this time of year I like to check my backups and download my archives.
One archive I download is the archive of my Mastodon posts. Pretty much the only one now I’ve left the corporate web really.
I also like to copy the contents of my public fediverse posts into my own diary within my vimwiki.
Keep it all in one place for easy and local search.
Here’s the script I use. It’s very short and just copies the content of every post in the archive into a new diary entry in the vimwiki diary.
If it finds something already there, it appends.
It checks whether it has already written a post into the diary, so nothing gets duplicated when you run it over and over again every month or year or whatever.
Paste it into a new text file called toVimWiki.php, download and unzip your Mastodon archive, and run the script with php, passing it the path to the archive’s outbox.json and the root diary directory.
My diary is honestly mostly just public posts these days. Ain’t much in it I won’t blab about on the internet for likes and lols.
<?php
/**
* A quick script to just copy contents of mastodon
* archive outbox to the vimwiki diary. Unzip the
* archive, then run this script passing it the
* path to the outbox.json in that archive and
* the vimwiki directory containing your diary
*/
if(sizeof($argv)<=1){
    print("toVimWiki: Add entries from Mastodon archive to a Vimwiki diary.\n");
    print("We will append to any existing entries,\n");
    print("And check if it's already there to avoid duplicates\n");
    print("Usage: php toVimWiki.php path/to/unzipped/mastodon/outbox.json path/to/diaryfolder/vimwiki/diary/\n");
    exit(1);
}
$fname="./outbox.json";
if(sizeof($argv)>1){
    $fname=$argv[1];
    if((!file_exists($fname))||(is_dir($fname))){
        print("Can't find the outbox json file $fname\n");
        exit(1);
    }
}
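// PHP doesn't expand ~ in paths, so build the default from $HOME
$vimWiki = getenv("HOME")."/vimwiki/diary/";
if(sizeof($argv)>2){
    $vimWiki=$argv[2];
}
if(!is_dir($vimWiki)){
    print("Can't find the diary directory $vimWiki\n");
    exit(1);
}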
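print("Loading $fname\n");
$contents = file_get_contents($fname);
$messages = json_decode($contents,true);
if(!is_array($messages)||!isset($messages["orderedItems"])){
    print("Couldn't read $fname as a Mastodon outbox\n");
    exit(1);
}
$threads = []; // id of first post in a thread => content of the whole thread
$replies = []; // id of a reply => id of the thread it was added to
$dates = [];   // id of first post in a thread => date as YYYY-MM-DD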
foreach($messages["orderedItems"] as $m){
    // Skip items that carry no post content
    if(isset($m['object']['content'])){
        if(isset($m['object']['inReplyTo'])){
            // A reply: find the thread it belongs to and append it
            $id = $m['object']['inReplyTo'];
            if(isset($threads[$id])){
                $threads[$id].="\n\n---\n\n".$m['object']['content'];
            }else if(isset($replies[$id])){
                // Reply to an earlier reply: follow it back to the thread
                $id = $replies[$id];
                if(isset($threads[$id])){
                    $threads[$id].="\n\n---\n\n".$m['object']['content'];
                }
            }
            // Remember where this reply ended up for later replies to it
            $rid = $m['object']['id'];
            $replies[$rid] = $id;
        }else{
            // A top-level post starts a new thread
            $id = $m['object']['id'];
            $threads[$id]=$m['object']['content'];
            $dates[$id] = substr($m['object']['published'],0,10);
        }
    }
}
print("Gathered ".sizeof($threads)." threads\n");
foreach($threads as $id=>$post){
    // Turn the post HTML into plain, wrapped text
    $post = str_replace('</p>', "</p>\n\n", $post);
    $post = str_replace(['&#39;','&apos;'], '\'', $post);
    $post= html_entity_decode(strip_tags($post));
    $post = wordwrap($post,60,"\n");
    // The post id doubles as the marker for the duplicate check
    $post.="\n\n[ $id ]\n\n";
    $existing = "";
    $fn = $vimWiki."/".$dates[$id].".wiki";
    if(file_exists($fn)){
        $existing = file_get_contents($fn);
    }
    // str_contains needs PHP 8 or newer
    if(!str_contains($existing,$id)){
        $existing.="\n\n= Public Post =\n\n$post\n\n";
        file_put_contents($fn,$existing);
        print("Saved to $fn\n");
    }else{
        print("$id already in $fn\n");
    }
}
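
If you want to poke at the thread-gathering part without a real archive, here’s a minimal sketch of it as a standalone function, run against a tiny made-up outbox. The function name gatherThreads, the $rootOf map and the sample ids are my own inventions for illustration. Only the field names (orderedItems, object, id, inReplyTo, content) come from the script above. Like the script, it assumes a post shows up in the outbox before any replies to it.

```php
<?php
// Sketch only: gatherThreads() and the sample ids are made up for
// illustration. The field names are the ones the script reads.

/**
 * Fold self-replies into the thread they belong to.
 * Returns root post id => combined HTML content.
 */
function gatherThreads(array $outbox): array {
    $threads = []; // root post id => content of the whole thread
    $rootOf = [];  // any post id already placed => its root post id
    foreach ($outbox['orderedItems'] as $item) {
        $obj = $item['object'] ?? null;
        // Skip items that carry no post content
        if (!is_array($obj) || !isset($obj['content'])) {
            continue;
        }
        $parent = $obj['inReplyTo'] ?? null;
        if ($parent === null) {
            // Top-level post: starts a new thread
            $threads[$obj['id']] = $obj['content'];
            $rootOf[$obj['id']] = $obj['id'];
        } elseif (isset($rootOf[$parent])) {
            // Reply to something in a known thread: append to its root
            $root = $rootOf[$parent];
            $threads[$root] .= "\n\n---\n\n" . $obj['content'];
            $rootOf[$obj['id']] = $root;
        }
        // Replies to posts that are not in the archive are ignored
    }
    return $threads;
}

// A tiny stand-in for outbox.json after json_decode($contents, true)
$sample = ['orderedItems' => [
    ['object' => ['id' => 'post/1', 'inReplyTo' => null, 'content' => '<p>First</p>']],
    ['object' => ['id' => 'post/2', 'inReplyTo' => 'post/1', 'content' => '<p>Second</p>']],
    ['object' => ['id' => 'post/3', 'inReplyTo' => 'post/2', 'content' => '<p>Third</p>']],
    ['object' => ['id' => 'post/4', 'inReplyTo' => 'elsewhere/9', 'content' => '<p>Reply to someone else</p>']],
    ['object' => 'elsewhere/5'], // no content, so it is skipped
]];

foreach (gatherThreads($sample) as $id => $content) {
    print("[ $id ]\n" . strip_tags($content) . "\n");
}
```

The three posts that reply to each other end up as one thread under post/1, which is why a thread lands in the diary as a single entry dated by its first post. The reply to somebody else’s post is left out, same as in the script.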