I was going to name this post Vim Golf, but it turned out to be more of a regex experiment than an exercise in Vim keystrokes.
So I found this cool IFTTT recipe that logs all of your tweets to a Dropbox file. It puts them all in a single file in this format:
At a drive in. Wat? @ Valle Drive-In http://t.co/Z9lz7O5F
Jun 30, 2012
http://twitter.com/darrinholst/status/218878382123909120
- - - - -
I see blurry apps
Jun 30, 2012
http://twitter.com/darrinholst/status/219074474073538560
- - - - -
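To make that format concrete, here's a rough Python sketch that reads it back into records. It assumes what the sample above shows: each record ends with a date line and a URL line, followed by the `- - - - -` separator. The names `Tweet` and `parse_log` are made up for illustration and are not part of the IFTTT recipe.

```python
from typing import NamedTuple

SEPARATOR = "- - - - -"


class Tweet(NamedTuple):
    text: str
    date: str
    url: str


def parse_log(log: str) -> list[Tweet]:
    """Split the log on the separator and peel the date and URL off each record."""
    tweets = []
    for chunk in log.split(SEPARATOR):
        lines = chunk.strip().split("\n")
        if len(lines) < 3:
            continue  # skip the empty chunk after the final separator
        # Assumption: the last two lines are date and URL; anything before
        # them is the tweet text, which may itself span several lines.
        *body, date, url = lines
        tweets.append(Tweet("\n".join(body), date, url))
    return tweets
```

A tweet whose text happens to contain `- - - - -` would confuse this, so treat it as a sketch rather than a robust parser.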
Since the Twitter API only lets you get at your most recent 3200 tweets, I thought it would be a good idea to get the rest of the tweets that are still accessible to me in there as well. The quickest way I know to get my latest tweets is allmytweets.net, which pulls down all your tweets and shows them on one page. The HTML ends up looking like:
8"><img src="css/extlink.png"></a></li><li>Straw in the wro
ng hole. @ High Life Lounge <a href="http://t.co/kdMfTvPk">
http://t.co/kdMfTvPk</a> <span class="created_at">Jun 12, 20
12</span> <a href="https://twitter.com/#!/darrinholst/status
/212644435689881600"><img src="css/extlink.png"></a></li><li
>I can see @dwolla from here! <span class="created_at">Jun 1
2, 2012</span> <a href="https://twitter.com/#!/darrinholst/s
tatus/212639918541910016"><img src="css/extlink.png"></a></l
i><li>Simplify Design With Zero, One, Many rules <a href="h
ttp://t.co/0bvdVzzo">http://t.co/0bvdVzzo</a> <span class="c
All the data is there, but it's not very useful for a txt file. Here are the commands I threw at it to format it up:
Split li tags up to separate lines
%s/<li>/\r<li>/g
Preserve new lines in the tweets with a token
%s/\n/##NL##/g
Change new line tokens in between li tags back to new lines
%s/<\/li>##NL##<li>/<\/li>\r<li>/g
Reverse the order of the tweets, since I want them in chronological order (the only non-replace command)
g/^/m0
Remove li start tags
%s/<li>//g
Replace li end tags with a separator
%s/<\/li>/\r- - - - -\r/g
Turn new line tokens back into new lines
%s/##NL##/\r/g
Get rid of the start span tag for the date
%s/<span class="created_at">/\r/g
Get rid of the end span tag and tweet link start tag. Also change from https to http
%s/<\/span> <a href="https/\rhttp/g
Get rid of those stupid #!s
%s/\/#!\//\//g
Get rid of the image from allmytweets.net
%s/"><img src="css\/extlink.png"><\/a>//g
HTML decode
%s/&amp;/\&/g
%s/&lt;/</g
%s/&gt;/>/g
%s/&nbsp;/ /g
Get rid of remaining html
%s/<a href="//g
%s/">.*<\/a>//g
Clean up trailing spaces
%s/ *$//g
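If you'd rather script this than replay it in Vim, here's a rough Python sketch of the same cleanup. It is not a line-for-line port: instead of the `##NL##` token trick it pulls out each `<li>...</li>` with a non-greedy match, and it uses the standard library's `html.unescape` in place of the individual entity substitutions. The names `convert` and `SEPARATOR` are made up for illustration, and it assumes the page markup looks exactly like the sample above.

```python
import html
import re

SEPARATOR = "- - - - -"


def convert(page: str) -> str:
    # Grab the body of every <li>...</li>; DOTALL keeps newlines inside a tweet.
    items = re.findall(r"<li>(.*?)</li>", page, flags=re.DOTALL)
    items.reverse()  # the page is newest-first; we want chronological order

    records = []
    for text in items:
        # Date goes on its own line.
        text = text.replace('<span class="created_at">', "\n")
        # Status link goes on its own line, switched from https to http.
        text = text.replace('</span> <a href="https', "\nhttp")
        # Drop the #! from the status URL.
        text = text.replace("/#!/", "/")
        # Drop the external-link image that closes the status link.
        text = text.replace('"><img src="css/extlink.png"></a>', "")
        # Decode entities; unescape turns &nbsp; into U+00A0, so map that to a space.
        text = html.unescape(text).replace("\xa0", " ")
        # Strip what is left of inline links, keeping the href as plain text.
        text = text.replace('<a href="', "")
        text = re.sub(r'">.*</a>', "", text)
        # Clean up trailing spaces on every line.
        text = re.sub(r" +$", "", text, flags=re.MULTILINE)
        records.append(text + "\n" + SEPARATOR + "\n")
    return "".join(records)
```

Calling `convert(open("allmytweets.html").read())` (the filename is a placeholder) should give text in the same shape as the IFTTT file, ready to paste above the existing entries.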
EDITING TEXT IS FUHHHH UHN! I wish I had thought of this 3800 tweets ago though.