[sword-devel] module statistics

DM Smith dmsmith555 at yahoo.com
Fri Jan 2 15:44:06 MST 2009


Peter,

My version is in ~dmsmith/bin.
It uses File::ReadBackwards, which it is able to find.
But when I run it, /var/log/xferlog cannot be read and the program dies.

In Him,
	DM

On Dec 29, 2008, at 5:58 AM, Peter von Kaehne wrote:

> Ok,
>
> Thanks to mwtalbert who made me aware of the exchange attached below.
>
> The script is still in ghellings' files. It does not work, as it requires
> some Perl modules which appear not to be available.
>
> I would be extremely grateful if someone could look into what is needed
> to make it work. I attach the error message below. Maybe it is something
> simple.
>
> Peter
>
> -------------------------------
>
> error message:
>
> [refdoc at www ~]$ ./makeDownloadsStats.pl
> Can't locate File/ReadBackwards.pm in @INC (@INC contains:
> /usr/lib64/perl5/5.10.0/x86_64-linux-thread-multi /usr/lib/perl5/5.10.0
> /usr/local/lib64/perl5/site_perl/5.10.0/x86_64-linux-thread-multi
> /usr/local/lib/perl5/site_perl/5.10.0
> /usr/lib64/perl5/vendor_perl/5.10.0/x86_64-linux-thread-multi
> /usr/lib/perl5/vendor_perl/5.10.0 /usr/lib/perl5/vendor_perl
> /usr/local/lib/perl5/site_perl /usr/local/lib64/perl5/site_perl .) at
> ./makeDownloadsStats.pl line 8.
> BEGIN failed--compilation aborted at ./makeDownloadsStats.pl line 8.
>
> ---------------------------------
>
> Old email exchange
>
> Greg Hellings wrote:
>> Troy,
>>
>> I've gotten that update finished.  The file
>> ~ghellings/makeDownloadsStats.pl has the update.  I dumped a diff of
>> it to ~ghellings/makeDownloadsStats.diff.  The numbers it comes up
>> with are surprisingly higher than the current Top20 list on the site
>> (KJV: 7841, Total: 213281), but if you haven't had updates since
>> March, I suppose it's not THAT staggering.  Provided the internal
>> structure of the logfiles hasn't changed, the log reading should still
>> be accurate, since I didn't touch that - only the filename processing.
>> Still, it would be good if someone else took a look at those edits and
>> checked that it is running properly.  When I ran it, it appeared to
>> only spend time parsing the files applicable to the last 30 days.
>>
>> I didn't remove any of the older code, I simply commented it out, so
>> if you want to maintain the cleanliness of the version you have in
>> version control, you might want to take out the commented lines before
>> committing the changes.
>>
>> --Greg
>>
>> On Fri, Aug 29, 2008 at 1:13 PM,  <greg.hellings at gmail.com> wrote:
>>> Troy,
>>>
>>> From the looks of that file, editing it to process the new log file
>>> naming scheme is almost as simple as pulling the directory listing
>>> rather than iterating over the file names with an integer counter.
>>> I'll finish the edit this afternoon when I next access a computer.
>>>
>>> Greg
>>>
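A minimal sketch of the directory-listing approach described above, in Python. It assumes the rotated files are named base-YYYYMMDD, as in Troy's example further down the thread; the function name recent_logs and its parameters are hypothetical and are not taken from makeDownloadsStats.pl.

```python
import datetime
import os
import re


def recent_logs(directory, base, days, today=None):
    """Return paths of `base` and of `base-YYYYMMDD` files dated within `days` days.

    Hypothetical helper: the naming scheme is assumed from the thread.
    """
    today = today or datetime.date.today()
    cutoff = today - datetime.timedelta(days=days)
    pattern = re.compile(re.escape(base) + r"-(\d{8})")
    selected = []
    # List the directory instead of counting base.1, base.2, ...
    for name in sorted(os.listdir(directory)):
        if name == base:
            # The un-suffixed file is assumed to be the log currently in use.
            selected.append(os.path.join(directory, name))
            continue
        match = pattern.fullmatch(name)
        if not match:
            continue
        try:
            stamp = datetime.datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            # Eight digits that do not form a valid date: skip the file.
            continue
        if stamp >= cutoff:
            selected.append(os.path.join(directory, name))
    return selected
```

Filtering on the date in the file name means that only files from the requested window are opened, which would match the behaviour Greg reports for the last 30 days.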
>>> On 8/24/08, Troy A. Griffitts <scribe at crosswire.org> wrote:
>>>> Dear Greg,
>>>>
>>>> Thank you so much for your work.  Both you and DM had offered to help on
>>>> this.  As DM has a ton of other tasks, I'm sure he would appreciate it
>>>> if you wanted to own this.  Here is the history up to now.
>>>>
>>>> Originally, I believe Joachim, Chris, Martin, and DM had a hand in
>>>> creating, improving, debugging, etc., a perl script to do module
>>>> statistics.  I think they worked out a good way to minimize skewed
>>>> numbers from multiple retries, multiple files per module, etc.  I've
>>>> moved their script to ~sword/bin/ on the server and placed it under
>>>> version control.
>>>>
>>>> If you'd like to own this task moving forward, you are more than
>>>> welcome-- and I think I can say this for all those involved in the
>>>> process in the past (though they can speak up if they still have a
>>>> heartfelt attachment to the task).  However, so as not to neglect
>>>> gleaning from their past work, I would like to ask you to take a look at
>>>> their script and see how they decided to compute numbers.
>>>>
>>>> This script is run from a daily cron job to produce the top20.html file
>>>> on sword's front page.  The arguments for the run are:
>>>>
>>>> /home/sword/bin/makeDownloadsStats.pl /home/sword/html/top20.html 20 30
>>>>
>>>> If your new Python script could take the same params and generate a
>>>> similar file, it would make it easy for me to substitute it into the
>>>> cron job.
>>>>
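A sketch of how a Python replacement might accept the same three positional parameters as the cron command above. The thread does not say what 20 and 30 mean; this assumes they are the number of modules to list and the number of days of logs to read. The argument names output, top and days are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse: <output html file> <top N> <days>, mirroring the cron invocation."""
    parser = argparse.ArgumentParser(
        description="Generate a top-N module download page.")
    parser.add_argument("output", help="path of the HTML file to write")
    # Assumption: the second argument is how many modules to list.
    parser.add_argument("top", type=int, help="number of modules to list")
    # Assumption: the third argument is how many days of logs to read.
    parser.add_argument("days", type=int, help="days of logs to include")
    return parser.parse_args(argv)
```

With this shape, parse_args(["/home/sword/html/top20.html", "20", "30"]) yields the output path plus two integers, so the cron line would only need the script name changed.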
>>>> If you don't feel this is something you'd like to own, maybe DM is still
>>>> willing to look into updating the current Perl script.
>>>>
>>>> Thanks everyone for your recent work and work from the past on this.
>>>> Automation is our friend: it captures nebulous knowledge floating around
>>>> and places it into a solid description, and keeps humans out of the role
>>>> of 'bottleneck'. :)
>>>>
>>>>      -Troy.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Greg Hellings wrote:
>>>>> Troy,
>>>>>
>>>>> I've written up a log processor for the download statistics.  It's the
>>>>> executable .py file in my user directory on the server.  Below is an
>>>>> example run of it:
>>>>>
>>>>> [ghellings at www ~]$ ./process_log.py ESV <path-to-log snipped>
>>>>> Total downloads: 362
>>>>> Unique downloads: 210
>>>>>
>>>>> It will accept as many files on the command line as you desire and
>>>>> report their statistics in aggregate.  This is most useful for
>>>>> tracking IP addresses across the multiple files.  It also works for
>>>>> the FTP files, but for those, relying on the total downloads is
>>>>> misleading, since it reports individual downloads of both new AND
>>>>> old testament .bz* files.  Thus, each individual download of the
>>>>> module should crop up as about 6 files in the "total downloads"
>>>>> section.  Unique downloads are based solely on IP address.
>>>>> As an example of the discrepancy of the counting:
>>>>>
>>>>> [ghellings at www ~]$ ./process_log.py ESV <path-to-log snipped>
>>>>> Total downloads: 540
>>>>> Unique downloads: 84
>>>>>
>>>>> Examples for comparison:
>>>>> [ghellings at www ~]$ ./process_log.py KJV <ftp log>
>>>>> Total downloads: 2098
>>>>> Unique downloads: 163
>>>>> [ghellings at www ~]$ ./process_log.py KJV <http log>
>>>>> Total downloads: 342
>>>>> Unique downloads: 198
>>>>>
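A minimal sketch of the total versus unique counting described above. This is not process_log.py itself, whose source is not shown in the thread. It assumes the log lines have already been parsed into (ip, path) pairs and that a module is recognised by a case-insensitive substring match on the path; both are assumptions, as is the name count_downloads.

```python
def count_downloads(records, module):
    """Return (total, unique) download counts for `module`.

    `records` is an iterable of (ip, path) pairs, a hypothetical
    pre-parsed form of the FTP or HTTP log lines.
    """
    needle = module.lower()
    total = 0
    addresses = set()
    for ip, path in records:
        # Assumption: the module name appears somewhere in the requested path.
        if needle not in path.lower():
            continue
        total += 1          # every matching file transfer counts
        addresses.add(ip)   # each IP address counts once
    return total, len(addresses)
```

Passing the records from several log files as one iterable keeps a single set of addresses, so a client that shows up in more than one file is still counted once. For FTP logs the total is inflated because one module download fetches several files, which is the discrepancy noted above.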
>>>>> Those stats are based off of the currently in-use log files.  If you
>>>>> would like a version of the script that will also report all module
>>>>> download totals, that can be provided for little extra work.
>>>>>
>>>>> --Greg
>>>>>
>>>>>
>>>>> On Tue, Aug 19, 2008 at 4:14 PM, Greg Hellings <greg.hellings at gmail.com> wrote:
>>>>>> Troy,
>>>>>>
>>>>>> On Tue, Aug 19, 2008 at 4:04 PM, Troy A. Griffitts <scribe at crosswire.org> wrote:
>>>>>>> Hey guys.  We have a few needs which need addressing:
>>>>>>>
>>>>>>> Log files got a new naming convention recently.  Instead of:
>>>>>>>
>>>>>>> ffff
>>>>>>> ffff.1
>>>>>>> ffff.2
>>>>>>> ...
>>>>>>>
>>>>>>> It has become
>>>>>>>
>>>>>>> ffff
>>>>>>> ffff-20080819
>>>>>>> ffff-20080818
>>>>>>> ...
>>>>>>>
>>>>>>> Hence our Perl scripts that generate module statistics are not working,
>>>>>>> seen on the left panel here:
>>>>>> I don't know thing 1 on Perl, so editing that is out for me.  A
>>>>>> rewrite into Python is possible if no one with Perl knowledge shows up.
>>>>>>
>>>>>>> http://crosswire.org/sword
>>>>>>>
>>>>>>> Also, Crossway asks for periodic download statistics for their ESV
>>>>>>> module.  I generated the last report for them by hand, but I would love
>>>>>>> for someone to write a script that would run on the first of each month
>>>>>>> and email them statistics for the previous month.
>>>>>> What format is the file in (I'm guessing it's an Apache file access
>>>>>> log)?  A simple Python script should be more than sufficient for this
>>>>>> purpose.  I can probably whip one up in little time.  Also, what
>>>>>> statistics are you in need of -- just a download count, or do you also
>>>>>> want to have information on the unique IP address downloads, etc.?  A
>>>>>> sample of one line of the file (or multiple lines, if a file access is
>>>>>> spread across several lines) which pertains to the ESV should be
>>>>>> sufficient to base the work off of -- more would be appropriate if
>>>>>> there are multiple formats the line appears in.  Also, odds are good
>>>>>> that the same script can be used to generate the statistics for any
>>>>>> individual module.
>>>>>>
>>>>>> --Greg
>>>>>>
>>>>>>> Any takers?
>>>>>>>
>>>>>>>       -Troy.
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> sword-devel mailing list: sword-devel at crosswire.org
>>>>>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>>>>> Instructions to unsubscribe/change your settings at above page
>>>>>>>




