
Author Topic: help with manipulating url  (Read 2694 times)


turbodiesel

    Topic Starter


    Greenhorn

  • Experience: Experienced
  • OS: Windows XP
help with manipulating url
« on: November 14, 2009, 12:26:13 AM »
I have a list of 500 similar URLs.
Approximately 490 of these link to the same page.
How can I find the remaining 10 in the list that do not take me to this page?
(Without clicking on each one, obviously.)

TIA

Turbodiesel

Salmon Trout

  • Guest
Re: help with manipulating url
« Reply #2 on: November 14, 2009, 01:26:14 AM »

From the command line you can use find.exe. The /v switch excludes lines which contain the search string. You can use the > filename syntax to redirect the output to a file.

old.txt

http://www.cat.com/collars.htm
http://www.cat.com/fish.htm
http://www.dog.com/bones.htm
http://www.cat.com/milk.htm
http://www.cat.com/scratch.htm
http://www.dog.com/collars.htm

type old.txt | find /v "www.cat.com" > new.txt

Only lines from old.txt that do not contain www.cat.com will end up in new.txt.

If you remove the /v, only lines that do contain www.cat.com will be copied.

Code:
C:\>type old.txt
http://www.cat.com/collars.htm
http://www.cat.com/fish.htm
http://www.dog.com/bones.htm
http://www.cat.com/milk.htm
http://www.cat.com/scratch.htm
http://www.dog.com/collars.htm

Code:
C:\>type old.txt | find /v "www.cat.com"
http://www.dog.com/bones.htm
http://www.dog.com/collars.htm

Code:
C:\>type old.txt | find "www.cat.com"
http://www.cat.com/collars.htm
http://www.cat.com/fish.htm
http://www.cat.com/milk.htm
http://www.cat.com/scratch.htm

Code:
C:\>type old.txt | find "collars"
http://www.cat.com/collars.htm
http://www.dog.com/collars.htm
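
For anyone who would rather script this than use find.exe, the same include/exclude filtering can be sketched in Python. This is only a rough equivalent: filter_lines and its exclude parameter are names made up for this example, and the list simply repeats the contents of old.txt above. Like find.exe without the /i switch, the match is case-sensitive.

```python
def filter_lines(lines, needle, exclude=False):
    """Return the lines that contain needle.

    With exclude=True, return the lines that do NOT contain it,
    which mirrors the behaviour of find /v.
    (Hypothetical helper written for this example.)
    """
    return [line for line in lines if (needle in line) != exclude]


# Same sample data as old.txt in the post above.
urls = [
    "http://www.cat.com/collars.htm",
    "http://www.cat.com/fish.htm",
    "http://www.dog.com/bones.htm",
    "http://www.cat.com/milk.htm",
    "http://www.cat.com/scratch.htm",
    "http://www.dog.com/collars.htm",
]

# Equivalent of: type old.txt | find /v "www.cat.com"
for line in filter_lines(urls, "www.cat.com", exclude=True):
    print(line)
```

Calling filter_lines(urls, "collars") without exclude gives the two collars lines, the same as the last find example.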
« Last Edit: November 14, 2009, 01:43:52 AM by Salmon Trout »

turbodiesel

Re: help with manipulating url
« Reply #3 on: November 15, 2009, 08:34:03 AM »
Thanks for taking the time to reply, m8 :)
I am referring to the destination page rather than the actual content of the original URL.
For example:

http://www.cat.com/collars/page1
http://www.cat.com/collars/page2
http://www.cat.com/collars/page3
...
http://www.cat.com/collars/page499
http://www.cat.com/collars/page500


490 of these URLs will redirect me to a page displaying a picture of a cat.

10 of these URLs will redirect me to 10 different pages displaying 10 different images.

How do I find the 10 "different" links without clicking on each of the 500 links to see where it takes me?

Spoiler



    Specialist

    Thanked: 50
  • Experience: Beginner
  • OS: Windows XP
Re: help with manipulating url
« Reply #4 on: November 16, 2009, 09:20:05 AM »
You really can't do this. You have to preview the page to know what you want to keep.

And even if the URL works for you today, there is nothing stopping the web site from changing the URL so that it shows up as one of the other 500 you already have....

Is this one of those sites that has a picture-of-the-day thing going on?


Salmon Trout

  • Guest
Re: help with manipulating url
« Reply #5 on: November 16, 2009, 10:39:53 AM »
You could play around with wget.
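
If wget is not to hand, the idea of letting a program follow each link and report where it ends up can also be sketched in Python. Treat this as a rough sketch under some assumptions: the site uses ordinary HTTP redirects (a JavaScript or meta-refresh redirect would not be followed), and the "normal" page is simply whichever destination turns up most often. The names final_url and find_outliers are made up for this example.

```python
from collections import Counter
from urllib.request import urlopen


def final_url(url, timeout=10):
    """Follow HTTP redirects and return the address of the page finally served."""
    with urlopen(url, timeout=timeout) as response:
        return response.geturl()


def find_outliers(urls, resolve=final_url):
    """Return {url: destination} for every URL that does not end up at the
    most common destination. resolve can be swapped out for testing."""
    destinations = {}
    for url in urls:
        try:
            destinations[url] = resolve(url)
        except OSError as exc:
            # urllib's URLError and HTTPError are OSError subclasses;
            # a failed URL is recorded so it shows up as an outlier.
            destinations[url] = f"ERROR: {exc}"
    if not destinations:
        return {}
    # The destination shared by most URLs (the cat picture in this thread).
    common, _count = Counter(destinations.values()).most_common(1)[0]
    return {u: d for u, d in destinations.items() if d != common}
```

To use it, read the 500 URLs from a text file into a list (one per line), pass the list to find_outliers, and print the result. Only the links that land somewhere other than the common page, or that fail to load, are listed. If the site serves different pictures at the same final address without redirecting, comparing addresses will not be enough and the page contents would have to be compared instead.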