2718.us blog » python
http://2718.us/blog
Miscellaneous Technological Geekery

Reposting from the AML TempSite
http://2718.us/blog/2009/09/06/reposting-from-the-aml-tempsite/
Mon, 07 Sep 2009 01:43:41 +0000, by 2718.us

This is not likely to be of interest to many people, but anyone who used uJournal (uJ) or AboutMyLife (AML), which absorbed uJ after its demise, should know that there has been a temporary site up at http://aboutmylife.net/tempsite/ where you can get a very bare dump of your entire journal. If you like, you can also take all those entries and post them into your current journal. Here is a process for doing that.

THIS INFORMATION IS PROVIDED AS-IS WITH NO EXPRESS OR IMPLIED WARRANTY. USE AT YOUR OWN RISK. It worked for me, but who knows what that may mean for you.

Requires: Python v2.something (maybe 2.4?). Mac OS X 10.4 works fine, as will most current Linux/Unix systems, I think.

  1. Go to the AML tempsite, log in, and save the file that shows up (which is all your entries, but totally lacking formatting, etc.) as “entries.html”
  2. Download pyLJxmlrpc.py from Google Code (I just put it there; I wrote it), save it in the same directory as entries.html
  3. Copy/paste the following into a file (I called it “processEntries.py” but it doesn’t really matter), and change the USERNAME and PASSWORD to the username and password of the account to which you want to post (you can also change “www.livejournal.com” to another journal site; it should work on any LJ-based site that supports the XML-RPC protocol). Line wrapping and whitespace are important.
    
    #!/usr/bin/python
    # Repost the entries from the AML tempsite dump (entries.html) to an LJ-style site.
    
    import re
    
    import pyLJxmlrpc
    
    # Read the whole dump and split it into one chunk per entry.
    f = open('entries.html')
    s = f.read()
    f.close()
    a = s.split('</td></tr><tr></tr><tr><td width="25%">')
    # Groups: year, month, day, hour, minute, subject, body.
    r = re.compile(r'([0-9]{4})-([0-9]{2})-([0-9]{2}) ([0-9]{2}):([0-9]{2}):[0-9]{2}</td><td width="75%">(.*)</td></tr><tr><td> </td><td>(.*)',re.DOTALL)
    
    processedEntries = {}
    for e in a:
        m = r.search(e)
        if m is None:
            continue  # chunk without a recognizable date line; skip it
        t = "%s-%s-%s %s:%s" % (m.group(1), m.group(2), m.group(3), m.group(4), m.group(5))
        processedEntries[t] = {'year':m.group(1), 'mon':m.group(2), 'day':m.group(3), 'hour':m.group(4), 'min':m.group(5), 'subject':m.group(6), 'body':m.group(7)}
    
    # Keys are zero-padded "YYYY-MM-DD HH:MM" strings, so sorting them is chronological.
    sk = sorted(processedEntries.keys())
    
    lj = pyLJxmlrpc.pyLJxmlrpc()
    
    for k in sk:
        entry = processedEntries[k]
        params = {
            'event': entry['body'], 'subject': entry['subject'],
            'lineendings': 'unix', 'security': 'private',
            'year': entry['year'], 'mon': entry['mon'], 'day': entry['day'],
            'hour': entry['hour'], 'min': entry['min'],
            'props': {'opt_backdated': True, 'taglist': 'aml-raw'},
        }
        lj.call_withParams_atURL_forUser_withPassword_('postevent', params, 'http://www.livejournal.com/interface/xmlrpc/', 'USERNAME', 'PASSWORD')
        # Printed after the post attempt, so an error belongs to the entry listed next.
        print("%s: %s" % (k, entry['subject']))
    
    
  4. At a command prompt (Mac: run Terminal), change to the directory in which you saved the two .py files and entries.html, and run
    python processEntries.py

    and watch it go. Pulling apart the HTML file takes only a few seconds, but reposting entries takes time. The script prints the date/subject of each entry *after* attempting to post it, so any error you see pertains to the entry whose date/subject is printed immediately after the error.

Every entry from AML that didn’t have an empty body will be posted with its original date and time preserved, set to private, backdated, and tagged “aml-raw”; you will see error messages for any entries that were blank (since the AML tempsite strips out all HTML, this left me with some blank entries where meme/quiz results had been).
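To see what the split-and-regex step does before posting anything, here is a minimal sketch that runs the same separator and pattern against a small sample. The sample HTML, the names `SEPARATOR`, `ENTRY_RE`, and `parse_entries`, and the two entries are all made up for illustration; they only mimic the table layout the script expects and may not match a real dump exactly.

```python
import re

# Same separator and pattern as the script above.
SEPARATOR = '</td></tr><tr></tr><tr><td width="25%">'
ENTRY_RE = re.compile(
    r'([0-9]{4})-([0-9]{2})-([0-9]{2}) ([0-9]{2}):([0-9]{2}):[0-9]{2}'
    r'</td><td width="75%">(.*)</td></tr><tr><td> </td><td>(.*)',
    re.DOTALL)

# Invented sample: two entries followed by a chunk that is not an entry.
SAMPLE = (
    '<table><tr><td width="25%">'
    '2004-03-01 09:15:00</td><td width="75%">First post</td></tr>'
    '<tr><td> </td><td>Hello, journal.'
    '</td></tr><tr></tr><tr><td width="25%">'
    '2003-12-25 18:30:00</td><td width="75%">Holiday</td></tr>'
    '<tr><td> </td><td>Snow today.'
    '</td></tr><tr></tr><tr><td width="25%">'
    'not an entry'
)


def parse_entries(html):
    """Return {'YYYY-MM-DD HH:MM': entry_dict} for every chunk that matches."""
    entries = {}
    for chunk in html.split(SEPARATOR):
        m = ENTRY_RE.search(chunk)
        if m is None:
            continue  # no recognizable date line in this chunk; skip it
        year, mon, day, hour, minute, subject, body = m.groups()
        key = "%s-%s-%s %s:%s" % (year, mon, day, hour, minute)
        entries[key] = {'year': year, 'mon': mon, 'day': day, 'hour': hour,
                        'min': minute, 'subject': subject, 'body': body}
    return entries


entries = parse_entries(SAMPLE)
# Zero-padded keys sort chronologically as plain strings.
for key in sorted(entries):
    print("%s: %s" % (key, entries[key]['subject']))
```

Running something like this against your own entries.html (read from the file instead of `SAMPLE`) is a cheap way to check the entry count and subjects before any posting happens.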

XML-RPC and Mac Programming, Revisited
http://2718.us/blog/2009/04/26/xml-rpc-and-mac-programming-revisited/
Sun, 26 Apr 2009 05:59:02 +0000, by 2718.us

I might have been wrong, or at least not entirely right, when I said that AppleScript’s XML-RPC was doing something screwy with UTF8-encoded responses to XML-RPC requests. I’m not sure if it’s LiveJournal (and other sites based on their code), or if it’s something inherent in XML-RPC. Either way, whether I make the XML-RPC calls in AppleScript (with its built-in mechanism for calling XML-RPC), in Python (with xmlrpclib), or in Objective-C/Cocoa (using Eric Czarny’s XML-RPC framework), things that I was expecting to be UTF8 strings were instead coming through as binary data that needed to be decoded.

Beyond that point, however, AppleScript was severely lacking in that the form in which that data was stored made it entirely unusable–AppleScript couldn’t convert it, couldn’t pass it off to an Objective-C method, etc.  As suggested in my previous post, there was a way around it, and messy though it was, I went about implementing that fix and by and large it worked (though it exposed another minor bug elsewhere).  But it really bothered me.

So I went back to trying to integrate Python code into my tangled web of AppleScript and Objective-C, since XML-RPC is fairly easy in Python, though not quite as easy as in AppleScript. Eventually, I succeeded in integrating a class written in Python into the program (documentation on using the PyObjC bridge in this direction is woefully inadequate), using a less inelegant means of fixing the binary UTF8 data:

unicode(theResult.data,'utf-8')

(and Python also allowed me to generically recurse through the entire return structure, which wasn’t possible in AppleScript). Unfortunately, this version was substantially slower than the broken-Unicode version and no faster (perhaps slower) than the AppleScript-fixed Unicode version.
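A rough sketch of that kind of generic recursion might look like the following. It is not the code from the client. It assumes Python 3, where the `xmlrpclib` module is named `xmlrpc.client`, and the function name `decode_binaries` and the sample response are invented for illustration.

```python
from xmlrpc.client import Binary  # Python 3 name for Python 2's xmlrpclib.Binary


def decode_binaries(value, encoding='utf-8'):
    """Walk a decoded XML-RPC result and turn Binary values into text."""
    if isinstance(value, Binary):
        return value.data.decode(encoding)
    if isinstance(value, dict):
        return {k: decode_binaries(v, encoding) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [decode_binaries(v, encoding) for v in value]
    return value  # ints, plain strings, booleans, etc. pass through untouched


# Invented sample shaped like a nested XML-RPC struct; the non-ASCII subject
# arrives as Binary, the ASCII body as an ordinary string.
response = {
    'events': [
        {'itemid': 1,
         'subject': Binary('caf\u00e9'.encode('utf-8')),
         'event': 'plain text'},
    ],
}
decoded = decode_binaries(response)
print(decoded['events'][0]['subject'])
```

The walk rebuilds dicts and lists rather than modifying them in place, so the original response stays available for comparison while debugging.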

This led me to look for a way to do the XML-RPC stuff in Objective-C.  Now, mind you, the single thing that enabled me to even think about writing a client for LJ for Mac was seeing just how easy AppleScript XML-RPC calls were.  While I didn’t particularly want to try Python, the XML-RPC calls there weren’t that much harder.  But going to Objective-C for XML-RPC…  that’s a fundamental change in the program.  At least, to me.

I did a lot of Googling and found that there are actually a few XML-RPC frameworks for Objective-C/Cocoa (the one by Eric Czarny that I used, the one from Brent Simmons, the Mulle one, XMLRPCObjC, SOPE). Supposedly, there’s a way to do it with Apple’s own Cocoa stuff, but the documentation is woefully inadequate (none of the frameworks have amazing and wonderful documentation, but Apple’s documentation is bad) and almost every mention of it that I found on mailing lists and discussion boards said it was broken. In the end, my framework choice was largely dictated by licensing, though there were also some issues with usability and dependencies. As with AppleScript and Python, the UTF8 strings weren’t coming through as strings, but as NSData objects, which are fairly easy to convert with

[[NSString alloc] initWithData:theObject encoding:NSUTF8StringEncoding]

Recursing through the entire returned structure wasn’t any harder in Objective-C than in Python.

The best part is that the resulting client with Objective-C-based-XML-RPC feels faster than the non-Unicode AppleScript-based-XML-RPC client.  In vaguely-objective tests (determine a set of steps that constitute a test and record the total time for just the XML-RPC calls in those steps, run the test several times under each app, compare times), the new version is measurably faster than the old version.
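The timing method described in the parentheses can be sketched in a few lines. This is only an illustration of the idea, not the harness actually used: `time_calls` and `fake_rpc_call` are hypothetical names, and the stand-in call just sleeps instead of talking to a server.

```python
import time


def time_calls(steps, runs=5):
    """Run every callable in `steps` once per run; return total seconds per run."""
    totals = []
    for _ in range(runs):
        elapsed = 0.0
        for step in steps:
            start = time.perf_counter()
            step()  # only the call itself is timed, not the work between calls
            elapsed += time.perf_counter() - start
        totals.append(elapsed)
    return totals


def fake_rpc_call():
    """Stand-in for one XML-RPC request (hypothetical; sleeps instead)."""
    time.sleep(0.01)


totals = time_calls([fake_rpc_call, fake_rpc_call], runs=3)
print("per-run totals: %s" % ["%.3f" % t for t in totals])
```

Comparing the per-run totals from two builds of the client, rather than a single run of each, makes it easier to tell a real difference from network noise.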

Bottom lines: (1) expect a new version of asLJ in the next few days, as soon as I get feedback from my early testers; (2) expect another post or two about other things I’ve learned in rewriting the XML-RPC aspect of asLJ in Objective-C.
