truthling February 2016

wget recursive not working as expected

Wondering if I am overlooking the obvious

I am trying to use

wget -rl 0 -A "*.fna.gz" ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/Acinetobacter_nosocomialis/all_assembly_versions

to download every file matching *.fna.gz from all the directories contained in ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/Acinetobacter_nosocomialis/all_assembly_versions/

If you visit the above link, you will see a list of directories starting with GCA. I want all the files in those directories that match *.fna.gz but I get nothing when I run the command. I'm wondering if wget is not recognizing the GCA* directories as directories, and this is the problem? Or is there something wrong with my wget command?

I am suspicious because when I try to download the directories with FileZilla I get:

GCA_000248315.2_ASM24831v2: Not a regular file
Error:  Critical file transfer error

Answers


Steffen Ullrich February 2016

These are not directories but links to somewhere else. There is no information in the file listing which gives the type of the target file, i.e. if directory or plain file or whatever. Thus wget will probably assume plain file and not follow it.


truthling February 2016

Apparently this isn't working as expected because of a bug on the server, which displays symbolic links to directories as ordinary files. As @Steffen Ullrich mentioned, nothing in the file listing gives the type of the link target (directory, plain file, or anything else), so wget most likely assumes a plain file and does not follow it. Thanks to codesquid_ on the FileZilla IRC for the clarification.

I have posted a follow-up question about a workaround: "recursive wget can't get files within symbolic link directories".
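
For anyone landing here, one possible workaround is sketched below. It is untested against this server and rests on an assumption: that wget will descend into a GCA entry if it is handed that entry directly as a URL with a trailing slash, so it never has to guess the type from the parent listing. The helper name gca_urls is made up for illustration, and the curl call (-s for silent, -l for a names-only listing) is just one way to get the entry names.

```shell
#!/bin/sh
# Sketch only: assumes each GCA_* entry can be fetched as a directory when
# requested explicitly with a trailing slash.
base="ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/Acinetobacter_nosocomialis/all_assembly_versions"

# gca_urls (hypothetical helper): reads entry names on stdin, strips any
# carriage returns, keeps names starting with GCA, and prints each one as a
# directory URL with a trailing slash.
gca_urls() {
    tr -d '\r' | grep '^GCA' | while IFS= read -r name; do
        printf '%s/%s/\n' "$base" "$name"
    done
}

# Usage (needs network access, so it is left commented out here):
# curl -s -l "$base/" | gca_urls | while IFS= read -r url; do
#     wget -r -l 1 -np -A "*.fna.gz" "$url"
# done
```

The wget call uses -l 1 and -np so that each run stays inside the one GCA directory it was given instead of wandering back up the tree.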

Post Status

Asked in February 2016
Viewed 3,918 times
Voted 9
Answered 2 times
