Problems with wget - Collection of common programming errors
soulSurfer2010
linux wget
I was using wget over the last week to recursively download a whole website of HTML pages. I used it this way:

    wget --recursive --no-clobber --page-requisites --html-extension --convert-links --domains XXXX.com --no-parent http://www.XXXX.com

The issue is that, since the download took a couple of days, there were occasional connection timeouts, network disconnections, etc., and when that happened wget seems to have skipped the HTML pages it couldn't fetch, which is not good in this case. I wonder if there is a…
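A minimal sketch of one common workaround, assuming the goal is simply to pick up the pages that were skipped: raise wget's retry and timeout settings and re-run the same command. Because --no-clobber skips files that already exist locally, only the missing pages are attempted again.

    # Re-run the mirror with more patient retry/timeout settings;
    # --no-clobber makes wget skip pages that already downloaded.
    wget --recursive --no-clobber --page-requisites --html-extension \
         --convert-links --domains XXXX.com --no-parent \
         --tries=10 --timeout=30 --waitretry=10 \
         http://www.XXXX.com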
Matt
wget
I'm trying to download a large 2 GB file with wget, but after a few minutes it seems to keep stalling. I ^C it and wget it again (with the --continue option), and it starts downloading again. Is there a way to automate retrying the download when it stops downloading? Thanks
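A small sketch of that automation (the URL is a placeholder): loop until wget exits successfully, letting --continue resume each attempt and --read-timeout abort an attempt that has stopped transferring data.

    # Retry until wget succeeds; --read-timeout=30 aborts a stalled
    # attempt after 30 seconds of no data, and --continue resumes it.
    until wget --continue --read-timeout=30 http://example.com/big.file; do
        echo "download stalled or failed; retrying" >&2
        sleep 5
    done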
surfealokesea
java linux wget socket-timeout-exception
I have a web crawler in Java that downloads web pages to get info via a cron job. This crawler works for other .com and .cat URLs, but I've just added a new site (http://www.harlemjazzclub.es/barcelona/propers-concerts/) and this one is giving me a connection timeout error (but it's not a real timeout error; there must be an underlying cause). Tried with Java code (1):

    URL _url = new URL(url);
    URLConnection con = _url.openConnection();
    con.setConnectTimeout(timeout);
    doc = Jsoup.parse(con.getInputStream()…
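One frequent cause worth ruling out (an assumption, not a confirmed diagnosis for this site): some servers drop or stall connections whose User-Agent header does not look like a browser, which surfaces on the client as a timeout. A quick shell test:

    # If this succeeds while the Java client times out, the server
    # is probably filtering on the User-Agent header.
    wget -O /dev/null --user-agent="Mozilla/5.0 (X11; Linux x86_64)" \
         "http://www.harlemjazzclub.es/barcelona/propers-concerts/"

If that is the case, the Java side can set the same header with con.setRequestProperty("User-Agent", "…") before reading the stream.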
blahdiblah
linux bash ftp wget
I have a minor bash script loop I'm trying to learn and accomplish, but it's giving me a hard time to comprehend despite all the reading I've done. I am uploading to an FTP server, then I want to check whether the upload went through successfully by using wget's spider method. If it returns OK then we're good; if not, I want to redo the FTP upload once more. The code I have below is based on my research and learning from here. If I could be doing this better, please help me revise it so I can learn…
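A minimal sketch of that check-and-retry shape. Both names are hypothetical: upload_via_ftp stands for whatever command does the upload, and $CHECK_URL is the public URL the uploaded file should appear at.

    upload_via_ftp                         # first attempt
    if ! wget --spider -q "$CHECK_URL"; then
        echo "upload not reachable; retrying FTP upload" >&2
        upload_via_ftp                     # one retry, as described
    fi

wget --spider exits non-zero when the URL is not reachable, so its exit status can drive the retry directly.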
chowden
bash shell wget
I'm currently writing a Bash script to simulate web activity. The idea is that the script runs for approximately three hours, then terminates. I've no problem with wget; the one-liner works perfectly at the command line. The code is here:

    #!/bin/bash
    a=$(date +%s)
    echo current time is $a
    b=$(($a + 10800)) # finish time is start time plus 10800, the number of seconds in three hours
    echo finish time is $b
    for line in "url_list.txt"; do
    echo current url is $line
    date=$(date +%s)
    echo current time is $c
    wget -e \
    -r -p…
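Two bugs stand out in the quoted script: for line in "url_list.txt" iterates over the literal string, not the lines of the file, and $c is echoed but never assigned. A corrected sketch, assuming url_list.txt holds one URL per line:

    #!/bin/bash
    start=$(date +%s)
    finish=$((start + 10800))            # three hours from now
    while [ "$(date +%s)" -lt "$finish" ]; do
        while IFS= read -r url; do
            echo "fetching $url at $(date +%s)"
            wget -r -p -q "$url"
        done < url_list.txt
    done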
caleban
bash shell wget tar
How can I automate the following with a bash shell script, using word designators and word modifiers or something similar?

    root@server:/tmp# wget -q http://download.zeromq.org/zeromq-2.2.0.tar.gz
    root@server:/tmp# tar -xzf !$:t
    tar -xzf zeromq-2.2.0.tar.gz
    root@server:/tmp# cd !$:r:r
    cd zeromq-2.2.0
    root@server:/tmp/zeromq-2.2.0#

When I try something like the below I get errors, because word designators and word modifiers don't appear to work the same way in bash scripts as they do in an interactive shell: Bash s…
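History expansion is an interactive feature and is off in scripts by default; the scriptable equivalent is parameter expansion. A sketch using the URL from the question:

    url=http://download.zeromq.org/zeromq-2.2.0.tar.gz
    file=${url##*/}        # zeromq-2.2.0.tar.gz  (like :t, the tail)
    dir=${file%.tar.gz}    # zeromq-2.2.0         (like :r:r, root twice)
    wget -q "$url"
    tar -xzf "$file"
    cd "$dir"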
Max13
linux bash pipe wget pipeline
I want to make a bash script which has to be fetched with wget and piped to bash, like this:

    wget -q -O - http://pastebin.com/raw.php?i=VURksJnn | bash

The "file" is a test script, and this command shows me "Unknown command" (maybe due to new lines) and "Unexpected end of file", which I don't understand. Am I missing something? Thanks for your help!
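Those two messages are the classic symptom of CRLF (Windows-style) line endings in the fetched script; bash sees the trailing carriage returns as part of each command name. A hedged fix, assuming that is indeed the cause:

    # Strip carriage returns before handing the script to bash.
    wget -q -O - "http://pastebin.com/raw.php?i=VURksJnn" | tr -d '\r' | bash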
alditis
php curl wget crontab
I have a question about which command to use to run a PHP script from crontab. Which of these alternatives would be the most appropriate to use?

    * * * * * /usr/bin/php /var/www/project/cron.php
    * * * * * /usr/bin/curl --silent --compressed http://localhost/project/cron.php
    * * * * * /usr/bin/wget http://localhost/project/cron.php

I hope my question is not a bother. Thanks.
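The trade-off in one sentence: the php binary runs the script directly (no web server involved, and possibly a different php.ini), while curl and wget exercise the full web stack. Two hedged variants with output discarded, so cron does not accumulate saved files or mail:

    # Direct CLI invocation, no web server involved:
    * * * * * /usr/bin/php /var/www/project/cron.php > /dev/null 2>&1

    # Through the web server, without leaving downloaded files behind:
    * * * * * /usr/bin/wget -q -O /dev/null http://localhost/project/cron.php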
Gramps
wget
We have a vendor that sends us XML feeds of new items with the appropriate path and image name. I created a script that strips out the web address path for each image, but if I try to fetch these with wget I get a protocol error. I tried using wget -i list.txt. With other, similar tools I do not have an issue. What would be causing the issue, and what is the best suggestion?
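A guess worth checking (not confirmable from the question alone): wget -i reports errors like this when the list's lines are not well-formed absolute URLs, for example missing the http:// scheme or carrying Windows line endings left over from the XML processing. A quick inspection and clean-up sketch:

    # Look for stray carriage returns and lines without a scheme.
    grep -c $'\r' list.txt
    grep -v '^http' list.txt

    # Hypothetical clean-up: drop CRs, then retry.
    tr -d '\r' < list.txt > list_clean.txt
    wget -i list_clean.txt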
akira
windows-xp wget
I'm downloading some measurements with wget, which are in the form of

    http://www.somewhere.com/sub-somewhere052/image0001.jpg
    http://www.somewhere.com/sub-somewhere052/file0001.txt

up to

    http://www.somewhere.com/sub-somewhere052/image0099.jpg
    http://www.somewhere.com/sub-somewhere052/file0099.txt

The problem is that these two numbers change from test to test (this was test 052), and each test has a number of files unknown to me in advance. However, they are all called the same (they're not c…
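A sketch of one approach, assuming the files are numbered contiguously from 0001 and the test number is known per run: count upward and stop at the first miss. (Written for a POSIX-style shell; on Windows XP this would need Cygwin or an equivalent batch loop.)

    test=052    # changes from run to run
    i=1
    while :; do
        n=$(printf '%04d' "$i")
        wget -q "http://www.somewhere.com/sub-somewhere${test}/image${n}.jpg" || break
        wget -q "http://www.somewhere.com/sub-somewhere${test}/file${n}.txt"
        i=$((i + 1))
    done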
Neil
windows download wget
I was trying to download a large ISO (6,822,254,592 bytes). After 4,423,760,272 bytes (which is more than 4 GB, so this isn't a simple 4 GB limit problem) my network connection was interrupted for long enough to prevent wget from restarting automatically, so I tried to continue it. But although wget thinks it downloaded the rest of the file, it didn't actually change the file.

    HTTP request sent, awaiting response… 206 Partial Content
    Length: 6,822,254,592 (6.4G), 2,398,494,320 (2.2G) remaining [a…
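When wget's resume misbehaves like this, one pragmatic cross-check (an alternative tool, not an explanation of wget's behaviour; the filename and URL are placeholders) is curl's resume mode, which derives the offset from the local file and makes it easy to verify the file is really growing:

    # -C - asks curl to compute the resume offset from the local file.
    curl -C - -o large.iso "http://example.com/large.iso"
    ls -l large.iso    # confirm the byte count actually increases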
lastsmoke
wget
When I use wget (GNU Wget 1.12 built on freebsd8.2) to fetch a WordPress RSS feed, it's very slow and I get an unexpected first line. Take www.healthsalon.org/feed/ for example; the first two lines are:

    2061
    <?xml version="1.0" encoding="UTF-8"?>

The first line is unexpected. If I use a browser or news reader, the feed works well. What does the first line mean? Any help is much appreciated!
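A hedged reading: 2061 looks like a hexadecimal chunk-size line from HTTP chunked transfer encoding that has leaked into the body, which would point at a server or intermediary mishandling the response rather than at the feed itself. The response headers should show whether chunked encoding and a proxy are involved:

    # -S prints the server's response headers; look for
    # Transfer-Encoding: chunked and any Via: proxy headers.
    wget -S -O /dev/null "http://www.healthsalon.org/feed/"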
guntbert
command-line firefox downloads wget
I use wget with FlashGot in Firefox to download. I can't resume a download after I pause it (with Ctrl+C) because I don't know the command-line arguments wget was started with, so I can't rerun it manually in a terminal. So I did the following steps to get those command-line arguments while wget was downloading; after these procedures I pressed Ctrl+C to pause the download:

    ps -A | grep -i wget

and copied the PID of wget; suppose it is 16760.

    xargs -0 echo < /proc/16760/cmdline > ls.txt

(From http://stackoverflow.com/questions/821…
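A slightly simpler way to read the same thing (16760 is the PID from the question): /proc/<pid>/cmdline separates arguments with NUL bytes, so translating those to spaces prints the full command line directly.

    tr '\0' ' ' < /proc/16760/cmdline; echo

Re-running that recovered command with --continue appended should then resume the partially downloaded file.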
Registered User
bash shell-scripting wget
I am trying to download the FlareGet download manager via wget and I get an error:

    wget http://www.flareget.com/files/flareget/debs/amd64/flareget_2.3-24_amd64(stable)_deb.tar.gz
    bash: syntax error near unexpected token `('

Why does that error occur, and what is the solution?
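The parentheses in the filename are shell metacharacters (an unquoted ( starts a subshell), so bash rejects the line before wget ever runs. Quoting the URL keeps bash from interpreting them:

    # Single quotes shield the parentheses from the shell.
    wget 'http://www.flareget.com/files/flareget/debs/amd64/flareget_2.3-24_amd64(stable)_deb.tar.gz'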
saeed
java shell wget
I'm trying to write a graphical interface for wget. I can run shell scripts from Java and read a script's output, but in the case of wget the following code won't work:

    Runtime rt = Runtime.getRuntime();
    Process proc = rt.exec("wget someAddress");
    InputStream output = proc.getInputStream();
    int nextByte = output.read();
    while (nextByte != -1) {
        System.out.print((char) nextByte);
        nextByte = output.read();
    }

So what should I do in order to get the percentage of the download? Thanks in advance
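The likely explanation: wget writes its progress meter to stderr, not stdout, so getInputStream() (which reads the child's stdout) sees nothing. This is easy to confirm from a shell (the URL is a placeholder):

    # stdout.log stays empty; the percentages land in stderr.log.
    wget -O /dev/null http://example.com/file > stdout.log 2> stderr.log

In the Java code, the analogous change is reading proc.getErrorStream() instead of proc.getInputStream().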
Tshepang
wget mirror
I have tried using wget --mirror http://tshepang.net/, but it only retrieves one page, tshepang.net/index.html. Is this a bug in wget? Here's the output, from using the --debug option:

    DEBUG output created by Wget 1.12 on linux-gnu.
    Enqueuing http://tshepang.net/ at depth 0
    Queue count 1, maxcount 1.
    [IRI Enqueuing `http://tshepang.net/' with None
    Dequeuing http://tshepang.net/ at depth 0
    Queue count 0, maxcount 1.
    --2011-01-15 12:32:51-- http://tshepang.net/
    Resolving tshepang.net… 66.216.125…
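Two common causes for a mirror that stops after index.html, offered as hypotheses rather than a diagnosis of this site: a robots.txt that forbids crawling (wget honours it by default), or page links that wget treats as off-host (for example pointing at www.tshepang.net when the start URL has no www). Flags to test both at once:

    wget --mirror -e robots=off --span-hosts --domains=tshepang.net \
         http://tshepang.net/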
slhck
java linux shell wget apt-get
I'd like to run Multicraft on one of my servers. After some intensive debugging/installing I realised the server didn't yet have a Java Runtime Environment. I figured it would be easy to install, but that is where I was wrong.

    sudo apt-get install sun-java6-bin sun-java6-jre
    sudo: apt-get: command not found

So I figured I had to get the installer through wget, but I couldn't find a URL to do so. My Linux version is:

    version 2.6.18-194.8.1.el5.028stab070.4PAE
    root@rhel5-build-x32
    gcc version 4.1.2 20080…
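The el5 in that kernel string marks this as a RHEL/CentOS 5 family system, which uses yum, not apt-get. A hedged equivalent (the exact package name depends on the configured repositories):

    # OpenJDK 6 from the standard RHEL/CentOS repositories.
    sudo yum install java-1.6.0-openjdk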
Chinmay Kanchi
wget
When using wget in a script to download some files from Google Docs, the name of the file is not preserved. For example:

    wget 'http://spreadsheets.google.com/pub?key=pyj6tScZqmEfbZyl0qjbiRQ&output=xls'

saves the file as pub?key=pyj6tScZqmEfbZyl0qjbiRQ instead of indicatorhivestimatedprevalence15-49.xls, which is what I get if I click on the link in a browser. Is there any way to enforce this "browser-like" behaviour in wget?
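Browsers take that filename from the server's Content-Disposition response header, and wget can be told to honour it as well:

    # Use the server-suggested filename instead of the URL tail.
    wget --content-disposition \
         'http://spreadsheets.google.com/pub?key=pyj6tScZqmEfbZyl0qjbiRQ&output=xls'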
Lothar_Grimpsenbacher
apache2 php cron wget
Why is this cron job executing repeatedly over time, and what can I do to stop it? I have a cron job that is supposed to run at 4 each morning. It hits a PHP script that executes some daily data analysis; under normal conditions it runs once (taking about 2-3 minutes to complete) and quits. It has been working, but lately it is running amok right about the time the server crashes. I investigated and found the following. The crontab entry looks like this:

    * 4 * * * /usr/bin/wget -q -O /dev/null 'htt…
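That schedule is the likely culprit: * in the minute field means "every minute", so * 4 * * * fires sixty times between 4:00 and 4:59. Pinning the minute restores the intended once-a-day run (the URL is elided here as in the question):

    # Run once at 04:00 instead of every minute of the 4 o'clock hour.
    0 4 * * * /usr/bin/wget -q -O /dev/null '<url>'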
Bahadir Ismail Aydin
tomcat cron shell wget
I am running a web app on a Tomcat server. There is a hard-to-detect problem in the server code that causes it to crash once or twice every day. I will dig in and correct it when I have time, but until that day, restarting Tomcat (/etc/init.d/tomcat7 restart) or simply rebooting the machine seems a good enough workaround for now. I want to detect the liveness of the server with wget rather than grep or something else, because even though Tomcat is running, my service may be down…
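A minimal watchdog sketch along those lines, suitable for cron (the health URL is hypothetical; it should be a page the app itself renders, so that a wedged app fails the check even while the Tomcat process is alive):

    #!/bin/bash
    # wget --spider exits non-zero on any HTTP or network failure.
    if ! wget --spider -q --timeout=10 --tries=2 \
            http://localhost:8080/myapp/health; then
        /etc/init.d/tomcat7 restart
    fi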
Manuel
command-line server dropbox wget
I have downloaded Dropbox manually on many Ubuntu systems. Today I tried to do it on an Ubuntu 12.04 server edition, so I used the command

    wget -O - "https://www.dropbox.com/download?plat=lnx.x86_64"

in my home directory. The problem is, the command appears to crash: at the moment the message "HTTP request sent, awaiting response" is last shown, it starts throwing random characters on screen until I force it to stop. Any suggestions? Thank you in advance. Note: I'm executing this through PuTTY.
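Those random characters are almost certainly the download itself: -O - writes the body to stdout, which here is the terminal, and a gzipped tarball rendered as text looks like garbage. Either save to a file or pipe straight to tar, which is what -O - is normally combined with:

    # Save to a file:
    wget -O dropbox.tar.gz "https://www.dropbox.com/download?plat=lnx.x86_64"
    # Or unpack on the fly:
    wget -O - "https://www.dropbox.com/download?plat=lnx.x86_64" | tar xzf -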
Adi Inbar
perl curl wget ping traceroute
I found a script online that I thought was going to do what I needed, but I can't get it to work, as my Perl skills are pretty low. Basically, I need to monitor a URL on apple.com and make sure the download form is available; if it isn't, I need to receive an email saying that the form isn't available from $hostname, with the traceroute from that host. The traceroute is important because Apple uses Akamai and some GeoIP magic for their downloads. I'm open to keeping this script…
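Given the wget/curl tags, here is a hedged shell sketch of the same monitoring shape. Everything specific is a placeholder: the URL, the recipient, and the grep pattern standing in for "the form is present on the page".

    #!/bin/bash
    url="http://www.apple.com/some/download/page"    # placeholder
    if ! wget -q -O - "$url" | grep -q 'download-form'; then
        {
            echo "Form not available from $(hostname)"
            traceroute "$(echo "$url" | cut -d/ -f3)"
        } | mail -s "Apple download form check failed" admin@example.com
    fi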
studiohack
wget freebsd cron
I'm new to FreeBSD and have a problem: my script uses wget. When I call it from the command line it works fine and I see output like this:

    --2011-09-22 15:55:58-- http://svarog.***/monitor/reply.PHP?port_id=312342
    Connecting to 83.***.***.2:3128… connected.
    Proxy request sent, awaiting response… 200 OK
    Length: 8 [text/HTML]
    Saving to: `reply.PHP?port_id=312342'
    0K 100% 1.14M=0s
    2011-09-22 15:55:58 (1.14 MB/s) - `reply.PHP?port_id=312342…
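The output shows a proxy request, which suggests the usual "works in my shell, fails in cron" cause: cron jobs start with a minimal environment, so neither the shell's PATH nor its http_proxy setting is available to the script. Two hedged fixes, either of which is often enough:

    # In the crontab, declare the environment the script relies on:
    PATH=/usr/local/bin:/usr/bin:/bin
    http_proxy=http://83.***.***.2:3128/

    # And/or call wget by absolute path inside the script:
    /usr/local/bin/wget 'http://svarog.***/monitor/reply.PHP?port_id=312342'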
izx
security ssl wget curl
Using a command-line website downloader such as wget, curl or any other one, in a script: I have the SHA-1 and the SHA-256 certificate fingerprints of a website. Due to security concerns (1) (2), I don't want to use the public SSL certificate authority system; the fingerprint must be hard-coded. Can a wget-like application check the SSL fingerprint? wget does not have such functionality. (3) Using wget --ca-certificate or curl --cacert I would have to run my own local certificate authority, whi…
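One workable pattern, offered as a sketch rather than a hardened solution (it checks the fingerprint over a separate connection from the download, so a well-positioned active attacker could still race it): verify the presented certificate with openssl, and only then let wget fetch with CA checks disabled.

    #!/bin/bash
    host=example.com                   # placeholder
    expected='AA:BB:CC:...'            # hard-coded SHA-256 fingerprint
    actual=$(echo | openssl s_client -connect "$host:443" 2>/dev/null |
             openssl x509 -noout -fingerprint -sha256 | cut -d= -f2)
    if [ "$actual" = "$expected" ]; then
        wget --no-check-certificate "https://$host/file"
    else
        echo "fingerprint mismatch: $actual" >&2
        exit 1
    fi

Newer curl releases also ship --pinnedpubkey, which pins the server's public key rather than the certificate fingerprint.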
Ryley
perl ipc wget
I'm trying to stream a file from a remote website to a local command and am running into some problems when trying to detect errors. The code looks something like this:

    use IPC::Open3;
    my @cmd = ('wget', '-O', '-', 'http://10.10.1.72/index.php'); # any website will do here
    my ($wget_pid, $wget_in, $wget_out, $wget_err);
    if (!($wget_pid = open3($wget_in, $wget_out, $wget_err, @cmd))) {
        print STDERR "failed to run open3\n";
        exit(1);
    }
    close($wget_in);
    my @wget_outs = <$wget_out>;
    my @wget_errs = <$wget_er…
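One Perl-side detail worth knowing: IPC::Open3 treats an undefined error filehandle specially, sending the child's stderr to the same handle as stdout, so $wget_err needs to be a pre-created glob (e.g. via Symbol::gensym) to capture errors separately. If the goal is simply "stream the body but notice when wget fails", plain bash can carry that too; a sketch using PIPESTATUS (some_command is a placeholder for the local consumer):

    # Stream the page into some_command, keeping wget's stderr in a
    # log, then inspect wget's own exit status from the pipeline.
    wget -O - 'http://10.10.1.72/index.php' 2>wget.err | some_command
    if [ "${PIPESTATUS[0]}" -ne 0 ]; then
        echo "wget failed:" >&2
        cat wget.err >&2
    fi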
Fábio Silva
ssl command-line curl wget
I've been trying to use locally hosted https URLs in command-line/cron jobs and I get these errors; curl's output is similar. The same commands, when used on other servers, work perfectly.

    [root@tejon ~]# wget -O /dev/null https://www.domain.com/cron/notification.php
    --2013-05-27 10:33:38-- https://www.domain.com/cron/notification.php
    Resolving www.domain.com… 173.45.38.60
    Connecting to www.domain.com|173.45.38.60|:443… connected.
    OpenSSL: error:140770FC:SSL routines:SSL23_GET_SERVER_HELL…
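That OpenSSL error usually means the client did not receive a TLS handshake at all on port 443: typical causes are a non-SSL service answering there, an SNI/vhost mismatch, or a proxy variable hijacking the connection. openssl can separate those hypotheses from the same machine:

    # Talk TLS to the host directly, with and without SNI:
    openssl s_client -connect www.domain.com:443 -servername www.domain.com
    openssl s_client -connect www.domain.com:443
    # Check whether the shell/cron environment injects a proxy:
    env | grep -i proxy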
Alois Mahdal
windows wget
I'm using wget to download a set of files via HTTP, using one wget call per URL, in a simple cmd.exe batch. Also, I alternate between mirrors randomly and want to keep a separate tree for each mirror, like:

    http://server06//files/file1.txt -> temp\server06\files\file1.txt
    http://server03//files/file65.txt -> temp\server03\files\file65.txt

What I do now is:

    echo !url! | .\runners\wget.exe --tries=3 --force-directories --directory-prefix=.\temp\ --input-file=-

Sometimes it happens that, for some r…
Stephan Kristyn
java html jsp img wget
I don't know how to automatically retrieve and save an image to my hard disk given this HTML source:

    <img src="https://someRemoteServer/servlet/GetItemServlet?ts=32352.3&itemtype=blabla">

I tried wget, but it only saves the GetItemServlet request itself to my hard disk. I want to iterate through 700 images on the remote server (which I do not own, nor have backend access to) and save them all to my hard disk with a unique identifier.

Edit: the output of wget:

    HTTP Request sent, waiting for r…
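Two details usually bite here: the & in the query string must be quoted away from the shell, and -O is needed to pick a sensible output name. A sketch of the loop shape; the ts values for the 700 items are not given in the question, so the ts_list.txt input is purely hypothetical:

    # Hypothetical: one ts value per line in ts_list.txt.
    i=0
    while IFS= read -r ts; do
        i=$((i + 1))
        wget -O "item_${i}.jpg" \
             "https://someRemoteServer/servlet/GetItemServlet?ts=${ts}&itemtype=blabla"
    done < ts_list.txt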