Dual boot and get_iplayer
RS
2018-10-29 20:39:46 UTC
My main PC dual boots between Windows 10 1803 and kubuntu 18.04.1 LTS.
I would like to be able to run get_iplayer on whichever OS happens to be
running, as I do, for example, with Thunderbird, where I have both
profiles pointing to the same message base.

In the case of get_iplayer I would need to point the --profile-dir for
both installations to the same NTFS directory containing the files
options, tv.cache, radio.cache and download_history.

We had a long discussion back in March about file formats, because when
I copied the get_iplayer files from Windows 10 to ubuntu the options
file was interpreted wrongly. Ralph diagnosed the problem.

http://lists.infradead.org/pipermail/get_iplayer/2018-March/011536.html

It was that the files written in Windows used CRLF as a line terminator
(DOS format) and files written in Linux used LF as a line terminator
(Unix format). The files tv.cache, radio.cache and download_history
rely on separators to separate records and fields, so the different line
terminators do not matter. The options file does rely on the line
terminator LF to separate records and for some options the presence of a
CR causes the option to be wrongly interpreted.
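
To make the failure concrete, here is a minimal sketch in Python (not the Perl that get_iplayer itself uses). It assumes the options file holds one "name value" pair per line, which is an assumption about the format, and the function names and mode list are made up for illustration. A reader that splits on LF alone leaves the CR attached to the value, so an exact comparison against a mode name fails.

```python
def parse_options(text):
    """Split options text on LF only, as a Unix-style reader would.

    Assumes one "name value" pair per line (an assumption about the format).
    """
    options = {}
    for line in text.split("\n"):
        if not line.strip():
            continue  # skip blank lines, including a lone "\r"
        name, _, value = line.partition(" ")
        options[name] = value  # a trailing "\r" from a CRLF file survives here
    return options


# Illustrative mode names only, not a complete list.
AVAILABLE_MODES = ["hlshd", "hlsvhigh", "hvfhd"]


def mode_is_available(options):
    """Exact string comparison, so "hlshd\r" does not match "hlshd"."""
    return options.get("tvmode") in AVAILABLE_MODES
```

With a file written in DOS format, parse_options returns "hlshd" with a CR on the end for tvmode, and mode_is_available reports that the mode cannot be found.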

Ralph kindly suggested some commands to find the CR characters and remove
them, converting a file from DOS format to Unix format:
`od -c foo' will show them, and they can be removed with
`sed -i 's/\r$//' foo'.
I have since found an easier way to do the conversion. If I load a file
into the Linux nano text editor it tells me if there are some DOS format
lines. When I save it I can type alt-D to toggle between saving it in
Unix format and saving it in DOS format.
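
For anyone who would rather script the conversion than use sed or nano, the following is a rough Python equivalent. The function names are my own. It works on bytes so that no newline translation gets in the way, and it replaces every CR LF pair with LF. Unlike the sed command, it would leave alone a CR at the very end of a file that has no final LF.

```python
def count_crlf_lines(data: bytes) -> int:
    """Count DOS-style line endings, similar to the notice nano displays."""
    return data.count(b"\r\n")


def dos_to_unix(data: bytes) -> bytes:
    """Replace every CR LF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path):
    """Rewrite a file in place with Unix line endings."""
    with open(path, "rb") as f:  # binary mode: bytes are read untranslated
        data = f.read()
    with open(path, "wb") as f:
        f.write(dos_to_unix(data))
```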

That is not the end of the problem. Ralph also pointed out that if Perl
thought it was running under Windows it would convert LF to CRLF when
writing a file. I think I have now found a solution. Windows 10 now
supports the Linux Bash Shell and I have used it to install Ubuntu to
run under Windows 10. That does seem to work, and Perl is generating
files in Unix format, so it does not seem to recognise that it is
running under Windows, which is what I wanted.

At present I am still testing it with default file locations. I have
run into a problem I have so far been unable to solve; it hangs while
tagging. If I repeat the run with get_iplayer --force --tag-only
--verbose it displays the metadata it has processed followed by
Started writing to temp file.
Progress: > 0% ------------------------------------------------------|

It does not move from 0%. I have probably done something silly so that
I have prevented the temp file from being created or written to,
although I would then have expected an error message. I have looked at
the get_iplayer v3.17 script to try to see what it is doing when it
prints those strings. There is a comment "Tagger Class" at line 8814.
I cannot see the strings "Started" or "temp " or "Progress:" there or
anywhere else in the script.

Any ideas?

Best wishes
Richard
Charles Johnson
2018-10-30 11:45:02 UTC
Post by RS
That is not the end of the problem.  Ralph also pointed out that if
Perl thought it was running under Windows
I know very little Perl, but I'm surprised it should care about line
separators. In Java, one of the oldest classes for reading text files
(BufferedReader) doesn't care what line separators it finds. Roughly
speaking it will split on /[\r\n]+/ (though not via regex) and will
thus read the text files of any platform correctly. Surely there's at
least a module that will do the same in Perl?

Yes, _writing_ is a different matter and a decision would have to be made
RS
2018-10-30 16:46:56 UTC
On 30/10/2018 11:45, Charles Johnson wrote:
...
Post by Charles Johnson
Yes, _writing_ is a different matter and a decision would have to be made
A file has to be written before it can be read. If Perl writes a file
when running under Linux it will write LF as the line terminator. When
running under Windows it will write CR LF. When reading the file under
Windows, Perl will strip out the CR, so the calling script will not see
it. If a file has been written under Windows and is being read under
Linux the CR characters will not be stripped out, but will be returned
as part of the record.
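
The round trip described above can be imitated in Python by setting the newline argument of open() explicitly. This is only an emulation of the behaviour, not how Perl implements it, and the function names are hypothetical. Note that Python's own default is to translate on reading on every platform, which differs from what is described for Perl under Linux.

```python
def write_like_windows(path, lines):
    """Write text so that each "\n" becomes CR LF, as a Windows text layer would."""
    with open(path, "w", newline="\r\n") as f:
        for line in lines:
            f.write(line + "\n")


def read_like_linux(path):
    """Read text treating only LF as the terminator and translating nothing."""
    with open(path, "r", newline="\n") as f:
        return [line.rstrip("\n") for line in f]


def read_like_windows(path):
    """Read text with CR LF translated to LF before the caller sees it."""
    with open(path, "r", newline=None) as f:
        return [line.rstrip("\n") for line in f]
```

A file produced by write_like_windows comes back from read_like_windows clean, while read_like_linux returns each record with the CR still attached.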

When reading its tv.cache, radio.cache and download_history files
get_iplayer does not care what the line terminator is because it uses |
as a separator. For the options file it uses the line terminator as a
separator. For some options the presence of a CR does not matter. For
others it causes the option to be garbled.

We had a long discussion about this in March in the thread I linked to,
so I don't want to go over old ground. What I wanted to ask was

1. Has anyone else tried running get_iplayer under the Windows bash shell?

2. Can anyone shed any light on why AtomicParsley hangs when run under
the Windows bash shell?

Best wishes
Richard
Charles Johnson
2018-10-30 17:44:23 UTC
Post by RS
For the options file it uses the line terminator as a separator.  For
some options the presence of a CR does not matter.  For others it
causes the option to be garbled.
I don't have an options file and have never used one, but it would be
useful to know the specifics on this so we know what breaks and why
RS
2018-10-31 13:23:05 UTC
Post by Charles Johnson
For the options file it uses the line terminator as a separator.  For
some options the presence of a CR does not matter.  For others it
causes the option to be garbled.
I don't have an options file and have never used one, but it would be
useful to know the specifics on this so we know what breaks and why
This is what I wrote on 3rd March.

Is the format of the options file different in Windows and Linux? I am
setting up my machines to dual boot between Windows 10 and ubuntu 16.04.
On my main machine I wanted to copy the history file. I thought while
I was doing that I might as well copy the whole hidden .get_iplayer
directory including the cache and options files. I expected to have to
change the --output option since the path is different, but some of the
options don't work, and others do.

The criterion seems to be that options with a numeric or boolean value
like --subtitles or --expiry work.

Options with a string value like --tvmode, --radiomode or
--ffmpeg-loglevel don't. For example with --tvmode=hlshd I get

INFO: Downloading tv: 'Collateral: Series 1 - 2. Episode 2 (b09sym6m)
[original]'
) available for this programme with version 'original'
INFO: Available modes:
hlshd,hlsvhigh,hvfhd,hvfsd,hvfxsd,hvfhigh,hvfxhigh,hvfstd,hvflow

When I re-enter --prefs-add --tvmode=hlshd
--ffmpeg-loglevel=info gives an error

INFO: Downloading tv: 'Collateral: Series 1 - 2. Episode 2 (b09sym6m)
[original]'
') - using defaultalue for --ffmpeg-loglevel ('info

If I enter the options again with --prefs-add they work correctly, so it
is no big deal. Any ideas what is happening?

End of quotation.

As you can see, for options with a string value the presence of a
spurious CR not only causes the option to be misinterpreted, it causes
some of the INFO: text sent to stdout to be lost. For options with a
numeric or boolean value, it does not seem to matter.
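
The "lost" text is consistent with how a terminal treats a CR: the cursor returns to the start of the line and whatever follows overwrites what was already printed. The Python sketch below emulates that for a single line. The function name is my own, and the full message passed to it in the note afterwards is a reconstruction from the garbled output quoted above, not text taken from the get_iplayer source.

```python
def render_with_cr(text):
    """Emulate how a terminal displays one line containing carriage returns."""
    screen = []
    col = 0
    for ch in text:
        if ch == "\r":
            col = 0  # carriage return: back to the first column
        elif col < len(screen):
            screen[col] = ch  # overwrite a character already on the line
            col += 1
        else:
            screen.append(ch)  # extend the line
            col += 1
    return "".join(screen)
```

Feeding it "WARNING: invalid value for --ffmpeg-loglevel ('info" followed by a CR and then "') - using default" reproduces the second garbled line exactly: the 18 characters after the CR overwrite the first 18 characters of the message.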

Best wishes
Richard
David Cantrell
2018-10-31 13:03:57 UTC
Post by Charles Johnson
That is not the end of the problem.  Ralph also pointed out that if
Perl thought it was running under Windows
I know very little Perl, but I'm surprised it should care about line
separators. In Java, one of the oldest classes for reading text files
(BufferedReader) doesn't care what line separators it finds. Roughly
speaking it will split on /[\r\n]+/ (though not via regex) and will
thus read the text files of any platform correctly. Surely there's at
least a module that will do the same in Perl?
In Perl the default is to assume the local machine's line separator. You
can of course change this. There's probably something on the CPAN that
will wrap it up all neat and tidy so that you don't have to worry about
writing portable code.
--
David Cantrell | Official London Perl Mongers Bad Influence

There are many different types of sausages. The best are
from the north of England. The wurst are from Germany.
-- seen in alt.2eggs...
MacFH - C E Macfarlane
2018-10-30 17:13:18 UTC
Please see below ...
Post by RS
My main PC dual boots between Windows 10 1803 and kubuntu 18.04.1 LTS.
I would like to be able to run get_iplayer on whichever OS happens to
be running, as I do, for example, with Thunderbird, where I have both
profiles pointing to the same message base.
In the case of get_iplayer I would need to point the --profile-dir for
both installations to the same NTFS directory containing the files
options, tv.cache, radio.cache and download_history.
No, although you appear to have solved some of the problems, I do not
believe this can work satisfactorily for the sort of reasons you have
discovered.
Post by RS
We had a long discussion back in March about file formats, because
when I copied the get_iplayer files from Windows 10 to ubuntu the
options file was interpreted wrongly.  Ralph diagnosed the problem.
http://lists.infradead.org/pipermail/get_iplayer/2018-March/011536.html
It was that the files written in Windows used CRLF as a line
terminator (DOS format) and files written in Linux used LF as a line
terminator (Unix format).  The files tv.cache, radio.cache and
download_history rely on separators to separate records and fields, so
the different line terminators do not matter.  The options file does
rely on the line terminator LF to separate records and for some
options the presence of a CR causes the option to be wrongly interpreted.
Yes, there are problems with both OSs reading the other's files ...

    Windows doesn't recognise LF on its own as a line terminator, and
therefore the lines either side of one effectively get concatenated with
an oddball character in between.

    Linux doesn't recognise CR as part of the line terminator, even
though it is followed by LF, and therefore appends it to the line data.

... and I gave an example of each and how it had caused obscure bugs
that had wasted my time considerably in the past.
Post by RS
Ralph kindly suggested some code to convert files from DOS format to
Unix format.
`od -c foo' will show them, and they can be removed with
`sed -i 's/\r$//' foo'.
I have since found an easier way to do the conversion.  If I load a
file into the Linux nano text editor it tells me if there are some DOS
format lines.  When I save it I can type alt-D to toggle between
saving it in Unix format and saving it in DOS format.
What a palaver, see below.
Post by RS
That is not the end of the problem.  Ralph also pointed out that if
Perl thought it was running under Windows it would convert LF to CRLF
when writing a file.  I think I have now found a solution. Windows 10
now supports the Linux Bash Shell and I have used it to install ubuntu
to run under Windows 10.  That does seem to work, and Perl is
generating files in Unix format, so it does not seem to recognise that
it is running under Windows, which is what I wanted.
At present I am still testing it with default file locations.  I have
run into a problem I have so far been unable to solve; it hangs while
tagging.  If I repeat the run with get_iplayer --force --tag-only
--verbose it displays the metadata it has processed followed by
 Started writing to temp file.
 Progress: >  0% ------------------------------------------------------|
It does not move from 0%.  I have probably done something silly so
that I have prevented the temp file from being created or written to,
although I would then have expected an error message.  I have looked
at the get_iplayer v3.17 script to try to see what it is doing when it
prints those strings.  There is a comment "Tagger Class" at line 8814.
I cannot see the strings "Started" or "temp " or "Progress:" there or
anywhere else in the script.
I can't suggest a particular reason for this, other than checking
whether AtomicParsley is being found under your unusual arrangement, and
whether the parameters being fed to it are exactly what is expected,
including checking for possible embedded CRs and LFs.  One way of
finding these is to bracket a log message with, say ">>>" and "<<<"  - 
in other words, instead of a debug message like (this is not taken from
get_iplayer, I'm just demonstrating the principle) ...
    "Calling Atomic Parsley with the following command-line: ".$commandline
... alter it to something like ...
    ">>>Calling Atomic Parsley with the following command-line: ".$commandline."<<<"
... then, when examining the output, if there is an orphan "<<<" on a
line by itself, you know that someone somehow has inserted an extra
line-terminator.
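
The same bracketing idea can be sketched in Python, purely as an illustration, since neither get_iplayer nor gip.pl contains this code and the function names are hypothetical. The second function scans captured output for a "<<<" sitting on a line by itself, which is the sign of a stray LF at the end of the bracketed value.

```python
def bracket(message):
    """Wrap a debug message so that its exact start and end are visible."""
    return ">>>" + message + "<<<"


def orphan_closers(log_text):
    """Return the 1-based line numbers where "<<<" sits on a line by itself."""
    return [
        number
        for number, line in enumerate(log_text.split("\n"), start=1)
        if line.strip() == "<<<"
    ]
```

This catches a trailing LF only. A stray CR without an LF leaves the "<<<" on the same line, where a terminal would print it over the start of the message instead.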

But generally, I would not adopt your approach to solving this problem. 
I have five dual-boot PCs (though currently I can only use the 3 that
are laptops because my monitor caught fire) and 2 embedded linux NASs 
and have needed to reconcile get_iplayer configuration and search data
between them, and to do this, I wrote a script called gip.pl, which
takes care to write the correct line-endings for the target OS. 
Although it's designed to reconcile over a LAN between whatever OSs are
running on dual-boot PCs, I think you could modify it fairly easily to
reconcile between two locally mounted get_iplayer configurations.  If
you were able to achieve this, you could then run gip.pl instead of
get_iplayer directly, and if you gave it the appropriate command-line
parameters, it would reconcile the two configurations each time it was
run before calling get_iplayer(.pl) to perform the downloads. I began to
document it but didn't get very far; the script does, however, have what I
think and hope is fairly comprehensive self-documentation in the form of
comments:
    www.macfh.co.uk/Test/GiP.html
MacFH - C E Macfarlane
2018-10-31 10:02:05 UTC
Apologies for broken link ...
Post by MacFH - C E Macfarlane
Please see below ...
But generally, I would not adopt your approach to solving this
problem.  I have five dual-boot PCs (though currently I can only use
the 3 that are laptops because my monitor caught fire) and 2 embedded
linux NASs  and have needed to reconcile get_iplayer configuration and
search data between them, and to do this, I wrote a script called
gip.pl, which takes care to write the correct line-endings for the
target OS.  Although it's designed to reconcile over a LAN between
whatever OSs are running on dual-boot PCs, I think you could modify it
fairly easily to reconcile between two locally mounted get_iplayer
configurations.  If you were able to achieve this, you could then run
gip.pl instead of get_iplayer directly, and if you gave it the
appropriate command-line parameters, it would reconcile the two
configurations each time it was run before calling get_iplayer(.pl) to
perform the downloads. I began to document it, but didn't get very
far, but the script has, I think and hope, fairly comprehensive
    www.macfh.co.uk/Test/GiP.html
When I posted this yesterday evening, I didn't think to check that the
download link was actually working, which for reasons too complicated
and OT to be worth going into here, it wasn't.  I've changed the
filename and instructions accordingly, and now it is.
RS
2018-10-31 12:59:11 UTC
Post by MacFH - C E Macfarlane
Apologies for broken link ...
When I posted this yesterday evening, I didn't think to check that the
download link was actually working, which for reasons too complicated
and OT to be worth going into here, it wasn't.  I've changed the
filename and instructions accordingly, and now it is.
I am sure you are right that a script is a better approach to
synchronising get_iplayer files between installations.

What finally convinced me I was wasting my time with my approach was
what I wrote at the end of my post yesterday morning (which took several
hours to arrive) when I realised that the --output option in kubuntu
would be
--output=/media/user/label/directory
whereas in Windows bash shell for the F:\ drive it would be
--output=/mnt/f/directory
If I was going to need a script to change the --output option I might
just as well use it to change the line terminators.
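
A small script of that kind might look like the Python sketch below. It assumes the options file stores one "name value" pair per line with the output directory under the name "output", which is an assumption about the format, and the function names are hypothetical. It normalises the line terminators to LF and swaps in the output directory for the OS that is currently running.

```python
def localise_options(text, output_dir):
    """Return options text with LF line endings and a replaced output setting."""
    lines = text.replace("\r\n", "\n").split("\n")
    result = []
    for line in lines:
        name, _, _ = line.partition(" ")
        if name == "output":
            line = "output " + output_dir  # point at this OS's mount point
        result.append(line)
    return "\n".join(result)


def localise_options_file(path, output_dir):
    """Apply localise_options to a file in place, without newline translation."""
    with open(path, "r", newline="") as f:
        text = f.read()
    with open(path, "w", newline="") as f:
        f.write(localise_options(text, output_dir))
```

Under kubuntu it would be called with /media/user/label/directory, and under the Windows bash shell with /mnt/f/directory, before starting get_iplayer.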

Thanks for sight of your script. At 2700 lines of Perl I think it's
probably a lot more complicated than I need.

As for solving my problem with AtomicParsley, debugging C++ called from
Perl when my knowledge of both is sorely limited may be a step too far.
Now that I have discovered that the capitalisation of AtomicParsley
matters when invoking it, I'll have a play with running it manually.

Best wishes
Richard
MacFH - C E Macfarlane
2018-10-31 15:10:23 UTC
Please see below ...
Post by MacFH - C E Macfarlane
When I posted this yesterday evening, I didn't think to check that
the download link was actually working, which for reasons too
complicated and OT to be worth going into here, it wasn't. I've
changed the filename and instructions accordingly, and now it is.
Thanks for sight of your script.  At 2700 lines of Perl I think it's
probably a lot more complicated than I need.
Yes, 2700 lines is rather intimidating, but actually that arises from
the complexity of what needs to be done, and nearly all the work has
been done for you.  Your main tasks to convert the script for your own
needs will be:

    1)    Defining a suitable PC list, see lines 261ff
    2)    Probably adapting the subroutine/function  MountShare, see
lines 664ff, to mount a pre-existing local directory for one or more of
the pre-defined peer PCs, instead of trying to connect to it over the
network.

However you *may* be able to achieve this by defining suitably the peer
PC in step 1, without changing anything in MountShare in step 2  - 
after all, it's perfectly valid for a PC to connect to itself over the
network!

As an example command-line, I don't usually need to reconcile between
PCs on a daily basis, but I call gip.pl every evening before going to
bed, giving it a command-line which reflects what I've spotted in
Digiguide, and it downloads everything overnight.  Last night's command was:

        perl gip.pl a "Empire" "The Ghan" "Origins Of Us" --hide -g

        The first parameter is for gip.pl:
        a           All: download any hits from all four pre-defined
                    search lists (HDTV, TV, Hidef Radio, Radio)

        The rest are standard get_iplayer parameters:
        "Empire" "The Ghan" "Origins Of Us"
                    Additionally download any hits from these search terms
        --hide      Don't show search hits that have already been downloaded
        -g          Get the downloads

You would need an additional parameter before the 'a', something like ...
        -m:<hostname>,MSWin32
... or ...
        -m:<hostname>,Linux
... of which I presume the former is more likely, as I don't know of
many drivers to read Linux disk formats from Windows  -  they seem to
exist, but when I last researched this, ISTR they had a poor reputation 
-  whereas Windows NTFS disk formats can be both read and written from
within Linux.  To clarify, when gip.pl is reconciling, the source
line-endings are determined by the OS under which it is running, while
the destination line-endings are based on the Linux/MSWin32 parameter
when defining the target PC, either in the peer PC list, @PCs,
(preferred), or else from the command-line using defaults (may work, but
not recommended).

As far as the complexity of the script goes, some thought will make you
realise the reason for it  -  what at first seems a simple task,
actually turns out to be quite complex when you start to think about it:

    1)    Reconciliation between two configurations, which can  be
copying data from one configuration to the other, say newest to oldest,
or merging it.  Take the download history, for example.  To copy it
from one to another ...
            a)    Copy the file, changing the line-endings if required
            b)    Replace the paths of the downloaded files with the
output directory for the target configuration
... while to merge the data between the two, at very least you need to:
            a)    Strip out the paths from both the download histories
            b)    Merge sort, removing duplicates
            c)    Reinstate the correct paths for each of the two
configurations
... while you may wish also to remove other types of duplicates, say if
the same programme has been downloaded multiple times at different
resolutions, or otherwise clean up the data, say by throwing away
unwanted fields to reduce the file size.
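
The three merge steps for the download history can be sketched as follows. This is Python written for illustration and is not taken from gip.pl. It assumes each record is one line of fields separated by |, and that the downloaded file's path sits at the index given by PATH_FIELD. That index is a guess and should be checked against a real download_history file.

```python
import re

PATH_FIELD = 6  # assumed index of the filename field; check your own file


def strip_path(record, field=PATH_FIELD):
    """Step a: drop the directory part of the filename field and any CR or LF."""
    fields = record.rstrip("\r\n").split("|")
    if field < len(fields):
        # Split on either kind of slash so Windows and Linux paths both work.
        fields[field] = re.split(r"[\\/]", fields[field])[-1]
    return "|".join(fields)


def merge_histories(a, b, field=PATH_FIELD):
    """Step b: merge two histories, sorted, with exact duplicates removed."""
    records = [r for r in list(a) + list(b) if r.strip()]
    return sorted({strip_path(r, field) for r in records})


def reinstate_paths(records, out_dir, sep="/", field=PATH_FIELD):
    """Step c: put one configuration's output directory back on each filename."""
    result = []
    for record in records:
        fields = record.split("|")
        if field < len(fields) and fields[field]:
            fields[field] = out_dir.rstrip("\\/") + sep + fields[field]
        result.append("|".join(fields))
    return result
```

reinstate_paths would be called once per configuration, each with its own output directory and path separator, and the results written out with the line endings that configuration expects.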

Similarly, merging the options is more complex than may at first appear,
because some, such as exclusion terms, may be safely merged, while
others, such as local program paths, absolutely must not be merged!
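
As a rough Python illustration of that distinction (again not code from gip.pl), the sketch below treats options as a dictionary of name to value. The sets naming which options are local-only and which hold comma-separated lists are illustrative assumptions, and a real script would need to classify every option it handles.

```python
# Illustrative classification only; adjust to the options actually in use.
LOCAL_ONLY = {"output", "ffmpeg", "atomicparsley"}  # paths valid on one OS only
LIST_VALUED = {"exclude"}  # assumed to hold comma-separated terms


def merge_options(local, remote):
    """Merge remote options into local ones without touching local paths."""
    merged = dict(local)
    for name, value in remote.items():
        if name in LOCAL_ONLY:
            continue  # never import another machine's paths
        if name in LIST_VALUED and name in merged:
            terms = merged[name].split(",") + value.split(",")
            # dict.fromkeys removes duplicates while keeping the original order
            merged[name] = ",".join(dict.fromkeys(t for t in terms if t))
        else:
            merged.setdefault(name, value)  # keep the local value if both set
    return merged
```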

    2)    Cycle through a number of search lists calling get_iplayer to
download each one with different parameters appropriate to each list.
RS
2018-10-30 11:05:23 UTC
Post by RS
At present I am still testing it with default file locations.  I have
run into a problem I have so far been unable to solve; it hangs while
tagging.  If I repeat the run with get_iplayer --force --tag-only
--verbose it displays the metadata it has processed followed by
 Started writing to temp file.
 Progress: >  0% ------------------------------------------------------|
It does not move from 0%.
It seems those words were written by AtomicParsley. "Started writing"
occurs at line 4859 of parsley.cpp and "Progress:" at 4360. I am going
to have to work out how get_iplayer invokes AtomicParsley to write to a
temp file. That is unfortunate because although AtomicParsley
documentation describes the tags and the format of a .mp4 file, there is
very little on how to invoke it.

I have also realised that even if I solve this problem I am still going
to have a problem with the --output option in the options file. The
kubuntu version will need to be

--output=/media/user/label/directory

while the Windows bash shell version will need to be (if writing to the
F:\ drive)

--output=/mnt/f/directory

so I can't have a single options file common to both installations. I
guess I'll need a script to edit the options file every time I start
get_iplayer.

Best wishes
Richard
RS
2018-11-02 12:00:37 UTC
Post by RS
At present I am still testing it with default file locations.  I have
run into a problem I have so far been unable to solve; it hangs while
tagging.  If I repeat the run with get_iplayer --force --tag-only
--verbose it displays the metadata it has processed followed by
  Started writing to temp file.
  Progress: >  0% ------------------------------------------------------|
It does not move from 0%.
It seems those words were written by AtomicParsley.  "Started writing"
occurs at line 4859 of parsley.cpp and "Progress:" at 4360.  I am going
to have to work out how get_iplayer invokes AtomicParsley to write to a
temp file.  That is unfortunate because although AtomicParsley
documentation describes the tags and the format of a .mp4 file, there is
very little on how to invoke it.
AtomicParsley does create a temporary file, and it does write to it.
What it writes seems to start correctly with ftyp, moov, mvhd, trak,
tkhd, edts and so on atoms. The problem is that it abruptly stops. For
example last Sunday's Andrew Marr programme with hvfxsd3 was 813MByte.
The temp file stopped being written after 4365 bytes. The latest Film
Review was 137MByte. The temp file stopped after 196kByte. There is no
error message; it just hangs.

For debugging I can invoke AtomicParsley manually with something like

AtomicParsley /path/filename.mp4 "--description" "presenter"

but it probably is not worth the effort.
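
If anyone does want to try the manual route, a small wrapper with a timeout at least turns the silent hang into a definite result. This is a generic Python sketch with a hypothetical function name. It relies only on the standard subprocess module and knows nothing about AtomicParsley itself.

```python
import subprocess


def run_with_timeout(command, seconds):
    """Run a command; return its exit code, or None if it had to be killed."""
    try:
        completed = subprocess.run(command, timeout=seconds)
    except subprocess.TimeoutExpired:
        return None  # the child was killed after exceeding the time limit
    return completed.returncode
```

For example, run_with_timeout(["AtomicParsley", "/path/filename.mp4", "--description", "presenter"], 60) would return None if the program was still hanging after a minute. The time limit of 60 seconds is arbitrary.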

Best wishes
Richard
