2718.us blog » iziblog
http://2718.us/blog
Miscellaneous Technological Geekery

An Overhaul of LJ-Stat
http://2718.us/blog/2008/10/12/an-overhaul-of-lj-stat/
Sun, 12 Oct 2008 12:03:44 +0000, by 2718.us

I’m currently working on an overhaul of LJ-Stat.

It looks like calling curl_multi_exec() in PHP with too many requests at once causes some of the requests to fail strangely, which may account for the lack of data from several sites that are clearly not down and clearly provide stats.txt.  My current workaround is to do the requests in smaller blocks.
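
The workaround of doing the requests in smaller blocks can be sketched roughly as follows. This is a Python illustration, not the PHP that LJ-Stat actually uses; the names `blocks`, `fetch_in_blocks`, and `fetch_block`, as well as any particular block size, are assumptions made up for the example.

```python
from typing import Callable, Dict, Iterator, Sequence


def blocks(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_in_blocks(
    urls: Sequence[str],
    size: int,
    fetch_block: Callable[[Sequence[str]], Dict[str, str]],
) -> Dict[str, str]:
    """Fetch `urls` one block at a time and merge the per-block results.

    `fetch_block` is a hypothetical callback that performs one batch of
    simultaneous requests (the role curl_multi_exec() plays in PHP) and
    returns a mapping of URL to response body.
    """
    results: Dict[str, str] = {}
    for block in blocks(urls, size):
        results.update(fetch_block(block))
    return results
```

Only one block of requests is in flight at a time, so the number of simultaneous transfers never exceeds `size`, whatever the total number of sites.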

I’m also trying to provide more detail as to why there aren’t stats for the sites that don’t have stats.

But the biggest development is that there will probably be graphs of the data over time.  I say “probably” because, while the code is pretty much written, I’ve only been storing historical data for about a day so far (in the past, only the most recent data was kept), so it’s hard to tell whether the graphs will look okay with a lot of data or whether producing them will put a significant load on the server.  The data will probably also update on a more regular schedule and more often, likely at noon and midnight CT.

Also, if anyone knows for sure whether Bloty, IziBlog, and/or LiveLogCity are still alive or definitively dead, I’d like to know.  Oh, and CommieJournal seems to be looking at the possibility of moving to a different codebase, though I can’t for the life of me see why anyone would want to move thousands of accounts from the LJ codebase to something incompatible and with a different working paradigm.

Limitations of lj-stat
http://2718.us/blog/2008/04/13/limitations-of-lj-stat/
Sun, 13 Apr 2008 20:23:47 +0000, by 2718.us

To the best of my knowledge and research, my LJ-code-base Site Statistics page (lj-stat) has the most comprehensive list of sites running LiveJournal’s codebase (if you know of any that I’ve missed, please let me know).  The main point, though, is the comparative statistics.  This is where things get strange.  LJ and most of the sites provide a pretty statistics page at /stats.bml, and in most (or all?) instances stats.bml says at the top (this is from LJ itself):

Raw data can be picked up here.

where “here” links to /stats/stats.txt.  On at least one site, stats.bml has this text, but stats/stats.txt returns a 404.  On at least one site, both stats.bml and stats/stats.txt return 404.  Since it looks to me like the whole point of providing stats.txt was to provide a more machine-readable set of stats that didn’t require loading a full web page and screen-scraping, I have no intention of trying to screen-scrape the info I want.
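
The cases above can be told apart from HTTP status codes alone, with no screen-scraping. Here is a minimal Python sketch of that idea; the function name, the wording of the labels, and the use of `None` for "no response at all" are hypothetical and are not taken from lj-stat itself.

```python
from typing import Optional


def stats_availability(bml_status: Optional[int], txt_status: Optional[int]) -> str:
    """Describe why a site does or does not have machine-readable stats.

    Each argument is the HTTP status code returned for /stats.bml and
    /stats/stats.txt respectively, or None if the request got no response.
    """
    if txt_status == 200:
        return "stats.txt available"
    if txt_status is None:
        return "no response for stats.txt"
    if txt_status == 404 and bml_status == 200:
        return "stats.bml present, but stats.txt returns 404"
    if txt_status == 404 and bml_status == 404:
        return "both stats.bml and stats.txt return 404"
    # Anything else (redirects, server errors, ...) is reported verbatim.
    return f"stats.txt returned HTTP {txt_status}"
```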

Now, to make things even stranger, some sites are missing what I’d call “key” stats from their stats.txt files.  In particular, the one I care most about is the “active in some way in the past 30 days” measure since I think that’s the best measure of the vitality of a site (well, either that, or what portion of the total userbase it represents).  Stranger still is that some sites report numbers in stats.txt that not only don’t match stats.bml, but make no sense whatsoever (DeadJournal perpetually reports only 10 accounts updating in the past 30 days, even though stats.bml has more sensible numbers).

Unrelated to the content of stats.txt is the “Speed Index” column.  It is based on the rate of transfer reported by libcurl when retrieving stats.txt: the speed index of a site is its transfer rate expressed as a percentage of the fastest transfer rate.  What I don’t quite understand is how InsaneJournal is always at least twice as speedy as any other site, and often 4x or 6x as speedy.  It actually made me wonder if my server and theirs were somehow in the same datacenter or something, but there are at least a dozen hops between us (which is more than from my server to some other LJ-based sites), so maybe it does have something to do with the servers themselves and not just network conditions.
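
As a worked illustration of the speed index, here is a small Python sketch. The site names and transfer rates in the usage note are invented for the example, the rates are assumed to be in bytes per second, and lj-stat itself may compute or round the figure differently.

```python
from typing import Dict


def speed_index(rates: Dict[str, float]) -> Dict[str, float]:
    """Map each site to its transfer rate as a percentage of the fastest rate.

    `rates` maps a site name to the transfer rate measured while fetching
    its stats.txt (assumed here to be bytes per second).
    """
    fastest = max(rates.values(), default=0.0)
    if fastest <= 0:
        # No usable measurements: avoid dividing by zero.
        return {site: 0.0 for site in rates}
    return {site: 100.0 * rate / fastest for site, rate in rates.items()}
```

With made-up rates of 400000, 100000, and 50000 for three sites, the indices come out as 100.0, 25.0, and 12.5. The fastest site always scores 100, so a site that is 4x as fast as the rest pushes every other site down to 25 or lower.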

Please let me know if you have any suggestions for enhancements to lj-stat.  Also, feel free to try to convince the sites that don’t provide stats.txt to start providing it, and the sites where the numbers are clearly wrong to fix them.
