Author Topic: HTTRACK - Rule to only grab .zip files  (Read 4197 times)

DaveLembke

    Topic Starter


    Sage
  • Thanked: 662
  • Certifications: List
  • Computer: Specs
  • Experience: Expert
  • OS: Windows 10
HTTRACK - Rule to only grab .zip files
« on: December 26, 2016, 12:16:36 PM »
I'm still at work and can't test this, but I was wondering whether anyone has used HTTrack with filtering and knows if this would work. There is a website that hosts Unreal Tournament 99 add-on game maps, and there are way too many links to download each zip file manually one by one. I started the website copier/crawler last night after first checking how deep the .zip files are located. They are at the 3rd level in, so I set the limit to go only 3 levels deep, and as of this morning I had grabbed 12GB of data from this website. My guess is that a bunch of bloat I don't need is coming along with the .zip files. So I want an approach that is less hard-hitting on the site's web server and selectively grabs just the .zip files, which contain the maps people created for the game with the map editor.

With the rules below, I am not sure whether it will hit the first instruction that says refuse all and then skip past the next filter rule. I am hoping it works like a batched filter, where the first rule says don't copy any files and the next rule adds an exception for only the .zip files that I want. I am also not sure whether a single filter instruction can say "only copy .zip files".

I put together this filter from the website below, but I am not sure it will work the way I want. In a few hours I can test it out, but I decided a discussion might yield some useful info vs. trial and error on my own with filter rules that might work.


-*.* will refuse all files
+*.zip will accept .zip files for download


https://www.httrack.com/html/filters.html

patio

  • Moderator


  • Genius
  • Maud' Dib
  • Thanked: 1769
    • Yes
  • Experience: Beginner
  • OS: Windows 7
Re: HTTRACK - Rule to only grab .zip files
« Reply #1 on: December 26, 2016, 12:29:56 PM »
Never used the app... but back when I used XTree, *.extension always worked as a wildcard search, so it makes sense it would work for you in this application...
" Anyone who goes to a psychiatrist should have his head examined. "

DaveLembke

Re: HTTRACK - Rule to only grab .zip files
« Reply #2 on: December 26, 2016, 01:25:34 PM »
Digging deeper I think I might be able to have the filter on a single line:

*[file] or *[name]    any filename or name, e.g. not /,? and ; characters

So maybe this is the proper filter:

-*.* or +*.zip


I was thinking of || for OR and && for AND, but in an example that is not quite the same as what I am trying to achieve they show it as OR instead of ||, so it looks like it understands the word OR. If that is the case then I don't need 2 rules. It would be a single filter rule on one line, so I wouldn't have to worry about it seeing -*.* and not stepping to the next filter rule, since it should run everything on that filter line before exiting its loop and stepping to the next file to test, until all files on the targeted site have been exhausted.


DaveLembke

Re: HTTRACK - Rule to only grab .zip files
« Reply #3 on: December 26, 2016, 03:26:04 PM »
Well, it doesn't like the filter... According to the note in the software itself, you separate filter rules with a single space, so it would be

 -*.* +*.zip

However, the -*.* filter instruction does what I worried about: it hits that rule and it doesn't matter what follows. I tried reversing the filter rule as

+*.zip -*.* and it seems as though the first rule is cancelled because the next rule says no files.
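
One possible reading of both results, sketched in Python rather than taken from HTTrack itself. The sketch assumes that when several rules match a URL the last matching rule wins, and it uses Python's fnmatchcase only as a rough stand-in for HTTrack's wildcard syntax. The function decide and the sample URLs are made up for illustration. Under those assumptions, -*.* +*.zip would accept the zip files but still refuse the .html pages that link to them, which might explain why the crawl finds nothing, while +*.zip -*.* would refuse everything.

```python
from fnmatch import fnmatchcase


def decide(url, rules, default=True):
    """Return True if `url` would be accepted under the ASSUMED semantics:
    rules are space-separated, each starts with '+' or '-', and the last
    rule whose pattern matches decides the outcome."""
    verdict = default
    for rule in rules.split():
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict


# Hypothetical URLs, not taken from the real site.
urls = ["index.html", "maps/page2.html", "maps/example.zip"]

for rules in ("-*.* +*.zip", "+*.zip -*.*"):
    accepted = [u for u in urls if decide(u, rules)]
    print(f"{rules!r} accepts: {accepted}")
```

If the assumption holds, a rule set that also lets the linking pages through would be the thing to try, but that is untested here.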


I guess I am going to just copy everything, then pull the .zip files out of it at the end and delete the rest. Looking at what has copied so far, most of the data is in the .zip files and there are not many images etc., so I guess I will run it filterless with the depth set to 3.
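
The "copy everything, then keep only the .zip files" step could be done with a short script instead of by hand. This is a minimal sketch, not part of HTTrack: the function name collect_zips and the folder paths in the usage comment are hypothetical. It copies rather than moves, so the mirror stays untouched until the result has been checked.

```python
import shutil
from pathlib import Path


def collect_zips(mirror_dir, dest_dir):
    """Copy every .zip found under mirror_dir into dest_dir (flat).

    Files with the same name get a numeric suffix instead of being
    overwritten. Returns the list of copied paths.
    """
    mirror = Path(mirror_dir).resolve()
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    # sorted() materializes the listing before any file is copied.
    for src in sorted(mirror.rglob("*")):
        if not src.is_file() or src.suffix.lower() != ".zip":
            continue
        if dest in src.parents:
            continue  # skip files already inside the destination folder
        target = dest / src.name
        n = 1
        while target.exists():
            target = dest / f"{src.stem}_{n}{src.suffix}"
            n += 1
        shutil.copy2(src, target)
        copied.append(target)
    return copied


# Example call with made-up paths; adjust to the real mirror folder:
# collect_zips(r"C:\My Web Sites\ut99maps", r"C:\UT99\maps_zips")
```

Once the copied files look right, the mirror folder can be deleted manually.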

The website I am getting my maps from is this site: http://ut99maps.gamezoo.org/

There are way too many to click and download 1 by 1. I wish they had a single zip containing everything, so there would be just 1 download instead of having to use the HTTrack method to get all the custom-created maps.