Discussion:
'My indexing' broken by 3.07
Charles Johnson
2017-12-16 13:33:25 UTC
get_iplayer --type=radio --refresh >$RADIO_FILE

was the content of a script with which I built myself a text index of
programmes (there was possibly a more efficient way to derive the index
from the cache?). That no longer works in 3.07.

This is what I get:

get_iplayer v3.07, Copyright (C) 2008-2010 Phil Lewis
  This program comes with ABSOLUTELY NO WARRANTY; for details use --warranty.
  This is free software, and you are welcome to redistribute it under certain
  conditions; use --conditions for details.


INFO: Indexing radio programmes (concurrent)
...............................................................
INFO: Added 0 radio programmes to cache

Is there some way I can do the same without sticking to 3.06?
Mark Carroll
2017-12-16 13:39:38 UTC
Post by Charles Johnson
get_iplayer --type=radio --refresh >$RADIO_FILE
was the content of a script with which I built myself a text index of
programmes (there was possibly a more efficient way to derive the index
from the cache?). That no longer works in 3.07.
(snip)
Post by Charles Johnson
INFO: Indexing radio programmes (concurrent)
...............................................................
INFO: Added 0 radio programmes to cache
Is there some way I can do the same without sticking to 3.06?
Are you running into the very first bullet point from
https://github.com/get-iplayer/get_iplayer/wiki/release300to309#release307
?

] If you wish to list all programmes, you must now explicitly specify a
] wildcard search: get_iplayer ".*" - note the quotes.
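
Applied to the script from the first post, that note suggests adding the quoted wildcard as a search term. A minimal sketch, assuming --refresh can still be combined with a search term in 3.07 and that RADIO_FILE is set as in the original script (the fallback file name below is hypothetical):

```shell
# Assumption: RADIO_FILE is the index path used by the original script;
# the default name here is made up for illustration.
RADIO_FILE="${RADIO_FILE:-radio_index.txt}"

# The quotes stop the shell from treating .* as a glob, so get_iplayer
# receives the pattern unchanged.
if command -v get_iplayer >/dev/null 2>&1; then
    get_iplayer --type=radio --refresh ".*" >"$RADIO_FILE"
else
    echo "get_iplayer not found on PATH" >&2
fi
```

Without the quotes, a directory containing dot files would have their names substituted for the pattern before get_iplayer ever saw it.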

-- Mark
Charles Johnson
2017-12-16 13:54:15 UTC
Post by Mark Carroll
] If you wish to list all programmes, you must now explicitly specify a
] wildcard search: get_iplayer ".*" - note the quotes.
-- Mark
Thanks so much for that Mark. That looks like a regex. Is it, do you know?
Ralph Corderoy
2017-12-16 14:14:50 UTC
Hi Charles,
Post by Charles Johnson
Post by Mark Carroll
] wildcard search: get_iplayer ".*" - note the quotes.
Thanks so much for that Mark. That looks like a regex. Is it, do you know?
Yes. `^' also suffices.
--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
Charles Johnson
2017-12-16 14:16:35 UTC
Post by Ralph Corderoy
Yes. `^' also suffices.
Interesting. I wonder if 'match beginning of the line' is less expensive
internally?
Ralph Corderoy
2017-12-16 14:35:40 UTC
Hi Charles,
Post by Charles Johnson
Post by Ralph Corderoy
Yes. `^' also suffices.
Interesting. I wonder if 'match beginning of the line' is less
expensive internally?
Perl's regexp engine is historically extremely good at spotting
optimisations, and some of those details can be seen with its -D option
(-Dr for regexps) if perl is compiled with -DDEBUGGING, e.g. /x.*foo$/
might decide the minimum length is four and it must end with `foo'
before attempting to find an `x'.

But in this case, I think `^' is cheaper.

$ for p in ^ '.*'; do
>     for n in 100 1000 10000; do
>         seq $n |
>         perf stat -e instructions \
>             perl -ne "/$p/"
>     done
> done |&
> grep instructions:u
2,588,069 instructions:u
4,947,485 instructions:u
28,715,945 instructions:u
2,600,183 instructions:u
5,089,466 instructions:u
30,189,787 instructions:u
$

Re-arranged, that's

     n          /^/         /.*/    /.*/ - /^/
   100    2,588,069    2,600,183       +12,114
 1,000    4,947,485    5,089,466      +141,981
10,000   28,715,945   30,189,787    +1,473,842
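
Dividing each difference by the number of input lines gives a rough per-line cost for `.*' over `^'. This is just arithmetic on the figures above; the helper name extra_per_line is made up for illustration, and shell integer division truncates the result.

```shell
# Args: instruction count for /^/, instruction count for /.*/, line count.
# Prints the extra instructions per input line, rounded down.
extra_per_line() {
    echo $(( ($2 - $1) / $3 ))
}

extra_per_line 2588069 2600183 100       # n = 100
extra_per_line 4947485 5089466 1000      # n = 1,000
extra_per_line 28715945 30189787 10000   # n = 10,000
```

That works out to somewhere between 120 and 150 extra instructions per line on this particular run, which is small next to the fixed cost of starting perl.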
--
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
Charles Johnson
2017-12-16 15:59:22 UTC
Post by Ralph Corderoy
But in this case, I think `^' is cheaper.
Well done you! My instincts told me that it might be cheaper ;)
