2718.us blog » utf8 (http://2718.us/blog): Miscellaneous Technological Geekery

XML-RPC and Mac Programming, Revisited
http://2718.us/blog/2009/04/26/xml-rpc-and-mac-programming-revisited/
Sun, 26 Apr 2009 05:59:02 +0000, by 2718.us

I might have been wrong, or at least not entirely right, when I said that AppleScript’s XML-RPC was doing something screwy with UTF8-encoded responses to XML-RPC requests.  I’m not sure if it’s LiveJournal (and other sites based on their code), or if it’s something inherent in XML-RPC, but whether I make the XML-RPC calls in AppleScript (with its built-in mechanism for calling XML-RPC), in Python (with xmlrpclib), or in Objective-C/Cocoa (using Eric Czarny’s XML-RPC framework), things that I was expecting to be UTF8 strings were instead coming through as binary data that needed to be decoded.
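To make the symptom concrete, here is a minimal sketch in modern Python (`xmlrpc.client`, the Python 3 successor to `xmlrpclib`). It assumes, without confirmation from the server's code, that the value arrives as an XML-RPC `<base64>` element; the response body, the member name `subject`, and the text are all made up for illustration.

```python
import base64
import xmlrpc.client

# Hypothetical payload: UTF-8 text that a server chose to send as <base64>.
encoded = base64.b64encode("café".encode("utf-8")).decode("ascii")

# Hypothetical response body; "subject" is an illustrative member name.
RESPONSE = (
    '<?xml version="1.0"?>'
    "<methodResponse><params><param><value><struct>"
    "<member><name>subject</name>"
    "<value><base64>" + encoded + "</base64></value></member>"
    "</struct></value></param></params></methodResponse>"
)


def parse_response(xml_text):
    """Return the single result value of an XML-RPC methodResponse."""
    params, _method_name = xmlrpc.client.loads(xml_text)
    return params[0]


result = parse_response(RESPONSE)
subject = result["subject"]

# The value is a Binary wrapper, not a str, until it is decoded by hand.
print(type(subject).__name__)
print(subject.data.decode("utf-8"))
```

If that assumption holds, every client library would see the same thing, which would be consistent with the behavior being the same in all three languages.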

Beyond that point, however, AppleScript was severely lacking in that the form in which that data was stored made it entirely unusable–AppleScript couldn’t convert it, couldn’t pass it off to an Objective-C method, etc.  As suggested in my previous post, there was a way around it, and messy though it was, I went about implementing that fix and by and large it worked (though it exposed another minor bug elsewhere).  But it really bothered me.

So I went back to looking at trying to integrate Python code into my tangled web of AppleScript and Objective-C, since XML-RPC is fairly easy in Python, though not quite as easy as in AppleScript.  And, eventually, I succeeded in integrating a class written in Python into the program (documentation on using the PyObjC bridge in this direction is woefully inadequate), using a less inelegant means of fixing the binary UTF8 data:

unicode(theResult.data,'utf-8')

(and Python also allowed me to generically recurse through the entire return structure, which wasn’t possible in AppleScript).  Unfortunately, this version was substantially slower than the broken-Unicode version and no faster (perhaps slower) than the AppleScript-fixed Unicode version.
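The recursion mentioned above might look roughly like the following. This is a sketch in modern Python (`xmlrpc.client` rather than `xmlrpclib`), not the code from the app; the function name `decode_binary` and the sample data are hypothetical, and it assumes every binary value in the result really is UTF-8 text.

```python
import xmlrpc.client


def decode_binary(value, encoding="utf-8"):
    """Walk a decoded XML-RPC result and turn Binary wrappers into str.

    Assumption: every Binary value holds text in the given encoding.
    """
    if isinstance(value, xmlrpc.client.Binary):
        return value.data.decode(encoding)
    if isinstance(value, dict):  # XML-RPC struct
        return {key: decode_binary(item, encoding) for key, item in value.items()}
    if isinstance(value, (list, tuple)):  # XML-RPC array
        return [decode_binary(item, encoding) for item in value]
    return value  # ints, bools, plain strings, etc. pass through unchanged


# Made-up result resembling a nested struct/array response.
sample = {
    "count": 2,
    "events": [
        {"subject": xmlrpc.client.Binary("café".encode("utf-8"))},
        {"subject": "plain ascii"},
    ],
}
print(decode_binary(sample))
```

A real client would probably want to restrict the decoding to fields known to carry text, since genuinely binary fields would fail to decode or decode to garbage.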

This led me to look for a way to do the XML-RPC stuff in Objective-C.  Now, mind you, the single thing that enabled me to even think about writing a client for LJ for Mac was seeing just how easy AppleScript XML-RPC calls were.  While I didn’t particularly want to try Python, the XML-RPC calls there weren’t that much harder.  But going to Objective-C for XML-RPC…  that’s a fundamental change in the program.  At least, to me.

I did a lot of Googling and found that there are actually a few XML-RPC frameworks for Objective-C/Cocoa (the one by Eric Czarny, which I used; the one from Brent Simmons; the Mulle one; XMLRPCObjC; SOPE).  Supposedly, there’s a way to do it with Apple’s own Cocoa stuff, but the documentation is woefully inadequate (none of the frameworks have amazing and wonderful documentation, but Apple’s documentation is bad) and almost every mention of it that I found on mailing lists and discussion boards said it was broken.  In the end, my framework choice was largely dictated by licensing, though there were also some issues with usability and dependencies.  As with AppleScript and Python, the UTF8 strings weren’t coming through as strings, but as NSData objects, which are fairly easy to convert with:

[[NSString alloc] initWithData:theObject encoding:NSUTF8StringEncoding]

Recursing through the entire returned structure wasn’t particularly any harder in Objective-C than in Python.

The best part is that the resulting client with Objective-C-based-XML-RPC feels faster than the non-Unicode AppleScript-based-XML-RPC client.  In vaguely-objective tests (determine a set of steps that constitute a test and record the total time for just the XML-RPC calls in those steps, run the test several times under each app, compare times), the new version is measurably faster than the old version.
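For anyone wanting to repeat that kind of comparison, the procedure can be sketched as a tiny harness. This is an illustration in Python, not what was used for asLJ; `CallTimer`, `run_trials`, `fake_rpc`, and `sample_scenario` are hypothetical names, and `fake_rpc` merely sleeps in place of a real XML-RPC call.

```python
import time


class CallTimer:
    """Accumulates wall-clock time spent inside the wrapped calls only."""

    def __init__(self):
        self.total = 0.0

    def timed(self, func, *args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            self.total += time.perf_counter() - start


def run_trials(scenario, trials=5):
    """Run the same scripted scenario several times; return per-run totals."""
    totals = []
    for _ in range(trials):
        timer = CallTimer()
        scenario(timer)
        totals.append(timer.total)
    return totals


def fake_rpc(delay):
    """Stand-in for a real XML-RPC call (assumption: it just sleeps)."""
    time.sleep(delay)
    return "ok"


def sample_scenario(timer):
    # The fixed "set of steps": only the RPC calls go through the timer.
    timer.timed(fake_rpc, 0.001)
    timer.timed(fake_rpc, 0.002)


totals = run_trials(sample_scenario, trials=3)
print(["%.4f s" % t for t in totals])
```

Running the same scenario against each client and comparing the lists of totals gives a rough comparison; network variance means several runs per client are needed before reading much into the difference.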

Bottom lines: (1) expect a new version of asLJ in the next few days, as soon as I get feedback from my early testers; (2) expect another post or two about other things I’ve learned in rewriting the XML-RPC aspect of asLJ in Objective-C.

AppleScript’s XML-RPC Doesn’t Get Along with UTF8
http://2718.us/blog/2009/02/14/applescripts-xml-rpc-doesnt-get-along-with-utf8/
Sun, 15 Feb 2009 00:53:02 +0000, by 2718.us

While the ease of making XML-RPC calls in AppleScript is wonderful for, say, writing a LiveJournal client in mostly AppleScript Studio, it seems to be doing something really messed up with UTF8 strings returned by the server: they come into AppleScript as raw data objects, which it seems can’t be cast into any other type and can’t be passed easily into a Cocoa method to convert them. The easiest way to properly decode them seems to be the following:

  if class of theReturnedValue is "data" then
      try
          (* this will fail on a data object and then we will pull the (hex) bytes out as text
          and bring them back as a utf8 string object *)
          theReturnedValue as text
      on error errmess -- extract the data from the error message
          set bytesString to text ((offset of "«" in errmess) + 10) thru ((offset of "»" in errmess) - 1) of errmess
          set theReturnedValue to (run script "«data utf8" & bytesString & "»")
      end try
  end if

This checks the class of the returned value and, if it’s a raw data object, attempts to cast it as text, which raises an error.  It then extracts the string of hexadecimal values from the error message (the `+ 10` skips the ten characters of "«data utf8") and puts it into a proper UTF8 object, making everything happy again.
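The string surgery on the error message may be easier to follow outside AppleScript. Here is the same idea sketched in Python; the function name `decode_from_error` is hypothetical, and the wording of the sample error message is an assumption (only the "«data utf8…»" portion matters to the logic).

```python
def decode_from_error(message):
    """Pull the hex bytes out of a '«data utf8…»' literal and decode them."""
    prefix = "«data utf8"  # ten characters, matching the "+ 10" in the script
    start = message.index(prefix) + len(prefix)
    end = message.index("»", start)
    hex_bytes = message[start:end]
    return bytes.fromhex(hex_bytes).decode("utf-8")


# Assumed shape of the coercion error; 63 61 66 C3 A9 is "café" in UTF-8.
sample = "Can’t make «data utf8636166C3A9» into type text."
print(decode_from_error(sample))
```

The AppleScript version cannot decode the hex itself, which is why it rebuilds a "«data utf8…»" literal and hands it to `run script` instead.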

If anyone wants to tell me I’m wrong and there’s a simpler fix, I’d love to hear it, since this is essentially unworkable.
