    use File::Fetch;

    ### build a File::Fetch object ###
    my $ff = File::Fetch->new(uri => 'http://some.where.com/dir/a.txt');

    ### fetch the uri to cwd() ###
    my $where = $ff->fetch() or die $ff->error;

    ### fetch the uri to /tmp ###
    my $where = $ff->fetch( to => '/tmp' );

    ### parsed bits from the uri ###
    $ff->uri;
    $ff->scheme;
    $ff->host;
    $ff->path;
    $ff->file;
It allows you to fetch any file pointed to by an "ftp", "http", "file", "git" or "rsync" uri by a number of different means.
See the "HOW IT WORKS" section further down for details.
On Windows this value may be empty if the uri is to a network share, in which case the 'share' property will be defined. Additionally, volume specifications that use '|' as ':' will be converted on read to use ':'.
On VMS, which has a volume concept, this field will be empty because VMS file specifications are converted to absolute UNIX format and the volume information is transparently included.
http://example.com/index.html?x=y
would make the output file be "index.html" rather than "index.html?x=y".
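As a small illustration of the parsed accessors and of the query-string stripping described above, the following sketch builds an object without fetching anything. The uri is only an example value, and the sketch assumes the "output_file" accessor is the one that returns the local file name.

```perl
use strict;
use warnings;
use File::Fetch;

# Example uri only; nothing is downloaded here, new() just parses it.
my $ff = File::Fetch->new( uri => 'http://example.com/dir/index.html?x=y' )
    or die "Could not parse uri";

# Parsed bits from the uri.
print "scheme: ", $ff->scheme, "\n";
print "host:   ", $ff->host,   "\n";
print "path:   ", $ff->path,   "\n";

# The remote file name keeps the query string ...
print "file:   ", $ff->file, "\n";

# ... while the local output file name has it stripped.
print "output: ", $ff->output_file, "\n";
```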
By default it writes to "cwd()", but you can override that by specifying the "to" argument:
    ### file fetch to /tmp, full path to the file in $where
    $where = $ff->fetch( to => '/tmp' );

    ### file slurped into $scalar, full path to the file in $where
    ### file is downloaded to a temp directory and cleaned up at exit time
    $where = $ff->fetch( to => \$scalar );
Returns the full path to the downloaded file on success, and false on failure.
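The two forms of the "to" argument can be tried without network access by pointing a "file" uri at a local file. This is only a sketch: the temporary directories and the file name "a.txt" are made up for the example, and it assumes a Unix-style absolute path so that prefixing "file://" yields a valid uri.

```perl
use strict;
use warnings;
use File::Fetch;
use File::Spec;
use File::Temp qw(tempdir);

# Create a small local file that plays the role of the remote resource.
my $src_dir = tempdir( CLEANUP => 1 );
my $src     = File::Spec->catfile( $src_dir, 'a.txt' );
open my $fh, '>', $src or die "Cannot write $src: $!";
print {$fh} "hello\n";
close $fh or die "Cannot close $src: $!";

# Assumption: $src is an absolute Unix path, giving file:///...
my $ff = File::Fetch->new( uri => "file://$src" )
    or die "Could not parse uri";

# Fetch into a directory; the return value is the full path to the copy.
my $dest_dir = tempdir( CLEANUP => 1 );
my $where = $ff->fetch( to => $dest_dir ) or die $ff->error;
print "Saved to $where\n";

# Fetch into a scalar; the content is slurped into $content.
my $content;
$ff->fetch( to => \$content ) or die $ff->error;
print "Got ", length($content), " bytes\n";
```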
Below is a mapping of what utilities will be used in what order for what schemes, if available:
    file    => LWP, lftp, file
    http    => LWP, HTTP::Tiny, wget, curl, lftp, fetch, HTTP::Lite, lynx, iosock
    ftp     => LWP, Net::FTP, wget, curl, lftp, fetch, ncftp, ftp
    rsync   => rsync
    git     => git
If you'd like to disable the use of one or more of these utilities and/or modules, see the $BLACKLIST variable further down.
If a utility or module isn't available, it will be marked in a cache (see the $METHOD_FAIL variable further down), so it will not be tried again. The "fetch" method will only fail when all options are exhausted, and it was not able to retrieve the file.
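The try-in-order behaviour described above can be pictured with the following simplified sketch. It is not the module's actual code: the names %methods, %blacklist, %method_fail and try_fetch() are hypothetical, and the handlers are stand-ins that simulate an unavailable, a failing and a working method.

```perl
use strict;
use warnings;

# Hypothetical, shortened method list for one scheme.
my %methods     = ( http => [qw|lwp httptiny wget|] );
my %blacklist   = map { $_ => 1 } qw|ftp|;
my %method_fail;    # cache of methods found to be unavailable

# Try each method in order; return the first true result, or nothing
# once all options are exhausted.
sub try_fetch {
    my ( $scheme, $handlers ) = @_;

    for my $method ( @{ $methods{$scheme} || [] } ) {
        next if $blacklist{$method};      # disabled by the user
        next if $method_fail{$method};    # known to be unavailable

        my $handler = $handlers->{$method};
        unless ($handler) {
            # Not available: remember that, so it is not tried again.
            $method_fail{$method} = 1;
            next;
        }

        my $file = $handler->();
        return $file if $file;
    }

    return;
}

# Stand-in handlers: "lwp" is missing, "httptiny" fails, "wget" succeeds.
my $result = try_fetch(
    'http',
    {
        httptiny => sub { return },
        wget     => sub { return '/tmp/a.txt' },
    },
);
print "Fetched to $result\n" if $result;
```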
The "fetch" utility is available on FreeBSD. NetBSD and Dragonfly BSD may also have it from "pkgsrc". We only check for "fetch" on those three platforms.
"iosock" is a very limited IO::Socket::INET based mechanism for retrieving "http" schemed urls. For instance, it doesn't follow redirects.
"git" only supports "git://" style urls.
A special note about fetching files from an ftp uri:
By default, all ftp connections are done in passive mode. To change that, see the $FTP_PASSIVE variable further down.
Furthermore, ftp uris only support anonymous connections, so no named user/password pair can be passed along.
"/bin/ftp" is blacklisted by default; see the $BLACKLIST variable further down.
Default is "File-Fetch@example.com".
Default is "File::Fetch/$VERSION".
Default value is 1.
Note: When $FTP_PASSIVE is true, "ncftp" will not be used to fetch files, since passive mode can only be set interactively for this binary.
Set to false to silence warnings. Inspect the output of the "error()" method manually to see what went wrong.
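A minimal sketch of handling a failure quietly: warnings are switched off for one block and the message is read from "error()" instead. The uri deliberately points at a local file that is assumed not to exist, so every available method should fail.

```perl
use strict;
use warnings;
use File::Fetch;
use File::Temp qw(tempdir);

my ( $where, $message );
{
    # Silence File::Fetch warnings for this block only.
    local $File::Fetch::WARN = 0;

    # Assumption: this path does not exist on the local machine.
    my $ff = File::Fetch->new( uri => 'file:///nonexistent-dir/missing.txt' )
        or die "Could not parse uri";

    $where = $ff->fetch( to => tempdir( CLEANUP => 1 ) );

    # On failure, fetch() returns false and error() holds the last error.
    $message = $ff->error unless $where;
}

print "Fetch failed: $message\n" if defined $message;
```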
Good for tracking down why things don't work with your particular setup.
To disallow the use of, for example, "LWP" and "Net::FTP", you could set $File::Fetch::BLACKLIST to:
$File::Fetch::BLACKLIST = [qw|lwp netftp|];
The default blacklist is [qw|ftp|], as "/bin/ftp" is rather unreliable.
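If the default entry should be kept and the change limited to part of a program, one option is to extend the list with "local". This is a sketch, with "lynx" and "iosock" chosen only as example entries.

```perl
use strict;
use warnings;
use File::Fetch;

my @inside;
{
    # Keep the existing entries and add two more, for this block only.
    local $File::Fetch::BLACKLIST =
        [ @{ $File::Fetch::BLACKLIST }, qw|lynx iosock| ];

    @inside = @{ $File::Fetch::BLACKLIST };
    print "Blacklisted here: @inside\n";

    # ... create File::Fetch objects and call fetch() here ...
}

# Outside the block the previous blacklist is back in effect.
my @after = @{ $File::Fetch::BLACKLIST };
print "Blacklisted afterwards: @after\n";
```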
See the note on "MAPPING" below.
You can reset this cache by assigning an empty hashref to it, or individually remove keys.
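Both ways of clearing the cache might look like this. The sketch assumes the cache lives in $File::Fetch::METHOD_FAIL and is keyed by the short method names; "wget" is just an example key.

```perl
use strict;
use warnings;
use File::Fetch;

# Show which methods are currently marked as failed.
my @failed = sort grep { $File::Fetch::METHOD_FAIL->{$_} }
    keys %{ $File::Fetch::METHOD_FAIL };
print "Marked as failed: @failed\n";

# Forget a single entry, for example after installing the tool.
delete $File::Fetch::METHOD_FAIL->{wget};

# Or reset the whole cache by assigning an empty hashref.
$File::Fetch::METHOD_FAIL = {};
```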
See the note on "MAPPING" below.
    LWP         => lwp
    HTTP::Lite  => httplite
    HTTP::Tiny  => httptiny
    Net::FTP    => netftp
    wget        => wget
    lynx        => lynx
    ncftp       => ncftp
    ftp         => ftp
    curl        => curl
    rsync       => rsync
    lftp        => lftp
    fetch       => fetch
    IO::Socket  => iosock
$ENV{ftp_proxy} = 'foo.com';
Refer to the LWP::UserAgent manpage for more details.
Sadly, "lynx" doesn't support any options to return a different exit code on non-"200 OK" status, giving us no way to tell the difference between a 'successful' fetch and a custom error page.
Therefore, we recommend using "lynx" only as a last resort. This is also why it is at the back of our list of methods to try.
If you have any other characters you need to escape, please install the "URI::Escape" module from CPAN, and pre-encode your URI before passing it to "File::Fetch". You can read about the details of URIs and URI encoding here:
http://www.faqs.org/rfcs/rfc2396.html
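Pre-encoding a file name with "URI::Escape" could look like the sketch below. The host, directory and file name are made-up example values, and "URI::Escape" must be installed from CPAN first. Only the file name is escaped, so the "/" separators in the rest of the uri stay intact.

```perl
use strict;
use warnings;
use File::Fetch;
use URI::Escape qw(uri_escape);

# Example file name containing a space and a '#'.
my $name    = 'a file#1.txt';
my $escaped = uri_escape($name);

# Build the uri from the already-encoded name.
my $uri = "http://example.com/dir/$escaped";
print "Encoded uri: $uri\n";

# Pass the pre-encoded uri along; nothing is fetched in this sketch.
my $ff = File::Fetch->new( uri => $uri )
    or die "Could not parse uri";
```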