'My indexing' broken by 3.07

Charles Johnson

2017-12-16 13:33:25 UTC

get_iplayer --type=radio --refresh >$RADIO_FILE

was the content of a script with which I built myself a text index of
programmes (there was possibly a more efficient way to derive the index
from the cache?). That no longer works in 3.07.

This is what I get:

get_iplayer v3.07, Copyright (C) 2008-2010 Phil Lewis
This program comes with ABSOLUTELY NO WARRANTY; for details use --warranty.
This is free software, and you are welcome to redistribute it under certain
conditions; use --conditions for details.

INFO: Indexing radio programmes (concurrent)
...............................................................
INFO: Added 0 radio programmes to cache

Is there some way I can do the same without sticking to 3.06?

Mark Carroll

2017-12-16 13:39:38 UTC

Post by Charles Johnson
get_iplayer --type=radio --refresh >$RADIO_FILE
was the content of a script with which I built myself a text index of
programmes (there was possibly a more efficient way to derive the index
from the cache?). That no longer works in 3.07.

(snip)

Post by Charles Johnson
INFO: Indexing radio programmes (concurrent)
...............................................................
INFO: Added 0 radio programmes to cache
Is there some way I can do the same without sticking to 3.06?

Are you running into the very first bullet point from
https://github.com/get-iplayer/get_iplayer/wiki/release300to309#release307
?

] If you wish to list all programmes, you must now explicitly specify a
] wildcard search: get_iplayer ".*" - note the quotes.

-- Mark
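Applied to the one-line script from the original post, the change would presumably look like the sketch below. It assumes --type and --refresh still behave as they did in 3.06 and only adds the quoted wildcard; the function name and the fallback file name are made up for illustration.

```shell
#!/bin/sh
# RADIO_FILE is the output path used by the original script; the default
# below is only a placeholder.
RADIO_FILE=${RADIO_FILE:-radio-index.txt}

# Hypothetical helper: rebuild the plain-text index of radio programmes.
build_radio_index() {
    # ".*" is the explicit wildcard search that 3.07 asks for. The quotes
    # stop the shell from treating it as a filename glob.
    get_iplayer --type=radio --refresh ".*" >"$RADIO_FILE"
}
```

Calling build_radio_index should then write the full listing to $RADIO_FILE, as the old script did.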

Charles Johnson

2017-12-16 13:54:15 UTC

Post by Mark Carroll
] If you wish to list all programmes, you must now explicitly specify a
] wildcard search: get_iplayer ".*" - note the quotes.
-- Mark

Thanks so much for that Mark. That looks like a regex. Is it, do you know?

Ralph Corderoy

2017-12-16 14:14:50 UTC

Hi Charles,

Post by Charles Johnson

Post by Mark Carroll
] wildcard search: get_iplayer ".*" - note the quotes.

Thanks so much for that Mark. That looks like a regex. Is it, do you know?

Yes. `^' also suffices.

--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy

Charles Johnson

2017-12-16 14:16:35 UTC

Post by Ralph Corderoy
Yes. `^' also suffices.

Interesting. I wonder if 'match beginning of the line' is less expensive
internally?

Ralph Corderoy

2017-12-16 14:35:40 UTC

Hi Charles,

Post by Charles Johnson

Post by Ralph Corderoy
Yes. `^' also suffices.

Interesting. I wonder if 'match beginning of the line' is less
expensive internally?

Perl's regexp engine is historically extremely good at spotting
optimisations, and some of those details can be seen with its -D option
if perl is compiled suitably, e.g. /x.*foo$/ might decide the minimum
length is four and it must end with `foo' before attempting to find an
`x'.

But in this case, I think `^' is cheaper.

$ for p in ^ '.*'; do
>     for n in 100 1000 10000; do
>         seq $n |
>         perf stat -e instructions \
>             perl -ne "/$p/"
>     done
> done |&
> grep instructions:u

2,588,069 instructions:u
4,947,485 instructions:u
28,715,945 instructions:u
2,600,183 instructions:u
5,089,466 instructions:u
30,189,787 instructions:u
$

Re-arranged, that's

     n         /^/        /.*/  difference
   100   2,588,069   2,600,183     +12,114
 1,000   4,947,485   5,089,466    +141,981
10,000  28,715,945  30,189,787  +1,473,842
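To put the differences in proportion, the same figures can be run through awk. This is just arithmetic on the numbers quoted above, not a new measurement, and overhead_table is a made-up name.

```shell
# Columns in the here-document: n, instructions for /^/, instructions for /.*/.
# Prints n, the absolute difference, and the difference as a percentage of /^/.
overhead_table() {
    awk '{ printf "%6d %10d %6.2f%%\n", $1, $3 - $2, 100 * ($3 - $2) / $2 }' <<'EOF'
100 2588069 2600183
1000 4947485 5089466
10000 28715945 30189787
EOF
}

overhead_table
```

The extra cost of `.*' works out at roughly 0.5%, 2.9% and 5.1% of the `^' figure for 100, 1,000 and 10,000 lines respectively, so it grows with the input but stays small.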

--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy

Charles Johnson

2017-12-16 15:59:22 UTC

Post by Ralph Corderoy
But in this case, I think `^' is cheaper.

Well done you! My instincts told me that it might be cheaper ;)
