[rt-users] Re: Generating static html files for crawler

Asif Iqbal vadud3 at gmail.com
Mon Feb 12 09:59:19 EST 2007


On 2/5/07, Asif Iqbal <vadud3 at gmail.com> wrote:
> On 1/31/07, Asif Iqbal <vadud3 at gmail.com> wrote:
> > Hi All
> >
> > Currently I am using a wget/perl script to generate static html pages
> > so that my crawler can index those html files. However, running the
> > `wget/perl' script pushes my mysql from its usual 1% cpu up to 27%
> > cpu.
> >
> > Is there a less expensive way (an RT way) to generate an exact static
> > replica of a default ticket page, look and feel included?
> >
> > I am using a `for loop' and `rt show ticket/id' to generate a list of
> > valid ticket numbers. The createstatic.pl script takes those numbers
> > as arguments and creates static html files.
> >
> > For example, I am assuming my latest ticket id is 400000 (I am not
> > sure how else to get the latest ticket id). So I run
> >
> >
> > for i in `seq 1 400000`; do rt show ticket/$i | grep -q id && echo $i; done >> tickets
> >
> > Then I run the next `for loop' to generate the static html pages
> >
> > for t in `cat tickets`; do perl createstatic.pl $t > /var/apache/htdocs/tickets/${t}.html; sleep 2; done
> >
> > So now my crawler can index the static pages.
> >
> > My createstatic.pl is attached.
>
>
> Does anyone know a better way (an RT way) to slurp a ticket html page
> besides using wget or curl? I collect the tickets as html pages so that
> my crawler can index them.

In case anyone missed the previous emails, I am reposting the question.

How do I generate a static html page of a ticket without using `wget' or
`curl', which are pretty expensive resource-wise?
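
For reference, the two loops quoted above can be written as a pair of
shell functions. This is only a sketch of the same approach, not a
cheaper RT-native method: it assumes the `rt' command-line client and
createstatic.pl are available, and the names list_valid_tickets,
generate_pages, OUT_DIR and PAUSE are made up for illustration. It does
not reduce the load on mysql by itself; it only makes the pause between
tickets adjustable and avoids word-splitting on the ticket list.

```shell
#!/bin/sh
# Sketch of the two-step pipeline from this thread, wrapped in functions.
# OUT_DIR, PAUSE and the function names are hypothetical.

OUT_DIR=${OUT_DIR:-/var/apache/htdocs/tickets}

# Print each id in 1..MAX for which the `rt show` output contains "id",
# which is the same check the original loop uses.
list_valid_tickets() {
    max=$1
    i=1
    while [ "$i" -le "$max" ]; do
        if rt show "ticket/$i" 2>/dev/null | grep -q id; then
            echo "$i"
        fi
        i=$((i + 1))
    done
}

# Read ticket ids from stdin, one per line, and write one html file each.
# PAUSE is the number of seconds to wait between tickets (2 in the thread).
generate_pages() {
    while read -r t; do
        [ -n "$t" ] || continue
        perl createstatic.pl "$t" > "$OUT_DIR/$t.html"
        sleep "${PAUSE:-2}"
    done
}
```

With these defined, the run would look like
`list_valid_tickets 400000 > tickets' followed by
`generate_pages < tickets'.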



-- 
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu


