Mike, bad headlines - see below. I don't know if they got through (skip the bad header to get to the content). It was nice talking to you last night.
I think around 20-100 lines of easy perl code could:
1. create a directory listing of the html files exported by a feed reader (or a search reader yet to be created, YTBC).
2. run tidy on each file. Pandoc needs good html, so clean up with tidy first.
3. run pandoc on each file, outputting markdown and sending all the results to one file (a rough sketch of steps 1-3 appears right after step 4).
4. parse the "big" file using perl. The construct /\[(mm)\]\((mm)\)/ matches a markdown url, where mm stands for a regular expression, and the variables $1 and $2 then contain the link name and url respectively, so processing markdown with perl should be easy.
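The first pass (steps 1-3) might look something like the sketch below. I have not run it either; the directory ./export, the temp file tidy.html, and the output file grand.md are just example names, and it assumes tidy and pandoc are installed.

#!/usr/bin/perl
# first pass: steps 1-3 (untested sketch; file and directory names are examples)
use strict;
use warnings;

my @files = glob("export/*.html");   # 1. directory listing of the exported html files
unlink "grand.md";                   # start the "big" markdown file fresh

foreach my $f (@files) {
    # 2. clean up with tidy first, since pandoc needs good html
    #    (tidy exits nonzero even on warnings, so don't die on its return value)
    system("tidy", "-q", "-asxhtml", "-o", "tidy.html", $f);

    # 3. convert to markdown and append all the results to one file
    system("pandoc -f html -t markdown tidy.html >> grand.md");
}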
First, process the url's. The second-pass code below goes through the "grand" file looking for url's:
# second pass: pull url's out of the "grand" markdown file
# (the file names here are only examples)
open(MD,  '<', 'grand.md')   or die "grand.md: $!";    # the big file from pandoc
open(FU2, '>', 'links.html') or die "links.html: $!";  # html list of links and titles
open(FU3, '>', 'fetch.sh')   or die "fetch.sh: $!";    # wget commands to run later
my $parms = "";                            # extra wget options, if any
while (<MD>)
{
    chomp;
    if ( /\[([^\]]*)\]\(([^)]*)\)/ )       # markdown is easy to recognize,
    {                                      # the parentheses put () content into
                                           # the variables $1 and $2
        print "Url is $2 and title is $1\n";      # info to stdout
        print FU2 "<br><a href='$2'>$1</a>\n";    # or something like this, to print
                                                  # an html file of links and titles
        print FU3 "wget $parms $2\n";             # print wget file, which you can execute
    }
}
close(MD); close(FU2); close(FU3);
So each url ends up on its own line, and the above line-oriented script extracts any url it finds. This is much easier than parsing html. I have not tested this perl code; it was only written in the email tool.
5. Save the list of url's and run wget on each url to get the big files (a short sketch follows step 6).
6. Then repeat the algorithm on the list of wget'ed files.
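Steps 5 and 6 might then be no more than this (again untested; fetch.sh is just the example name used for the wget file above):

# 5. run the wget commands collected by the second pass
system("sh fetch.sh") == 0 or warn "some downloads failed: $?";

# 6. then point the same tidy/pandoc/url-extraction passes at the wget'ed
#    files and go around again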
I have not done this yet, but this is the next plan. It's the next evolutionary step. Funny, I was already working on this before you called, so I am glad you called. If I succeed in doing this I will send you the perl script.
== KISS means keep it simple
Iltis
g.