Computer Hope

Microsoft => Microsoft DOS => Topic started by: iONik on November 30, 2021, 09:43:17 AM

Title: Extracting string starting with and ending with specific strings
Post by: iONik on November 30, 2021, 09:43:17 AM
OK, this is totally over my head here. Never learned all this (for /f "tokens) stuff.

I have a large text file that has many, many, many URLs embedded in it. Going through this and copying and pasting all these URLs would be quite painful, mentally and perhaps even physically, let alone very time consuming. In the end I'd like to have a new text document that lists all the URLs.

The string would start with (http) and end with ("), not counting the parentheses. Removing the (") from the result would be great but not critical.

Can anyone help me out?
I would appreciate it!

Brian
Title: Re: Extracting string starting with and ending with specific strings
Post by: Hackoo on November 30, 2021, 10:05:29 AM
This can be done easily with a VBScript or PowerShell script that uses a regex.
Title: Re: Extracting string starting with and ending with specific strings
Post by: iONik on November 30, 2021, 10:18:15 AM
eeks! ...and I know even less about those languages.
Title: Re: Extracting string starting with and ending with specific strings
Post by: Hackoo on November 30, 2021, 10:30:48 AM
Can you give us a small example of the input file and the output you expect after the extraction?
We can write a script that lets you drag and drop the input file onto a batch script or a VBScript and writes the result to another file.
So I'm waiting for the sample I asked for above.
@+
Title: Re: Extracting string starting with and ending with specific strings
Post by: iONik on November 30, 2021, 12:06:15 PM
A sample file:

"thumb_source_type":"screen","thumb_url":"","group_id":"2","screen_delay":0,"get_screen_method":"auto","need_sync_screen":0,"update_interval":"","thumb_width":728,"thumb_height":454.9101678183613,"position":6,"clicks":5,"deny":0,"screen_maked":1,"global_id":"vWNVsZDrcsdhsCGdFonDn5BkW9nRRJ","thumb_version":1,"id":14,"thumb":"filesystem:moz-extension://f65056c1-5c41-4f9e-b577-96f5fd992fb0/persistent/sd_previews/vWNVsZDrcssCGdFonDn55EBkW9nRRJ.png","rowid":14,"auto_title":"PortableApps.com - Portable software for USB, portable, and cloud drives","last_preview_update":1619905296092},{"url":"https://google.com/","title":"Unlock, speed up and easily transfer ","thumb_source_type":"screen","group_id":"2","get_screen_method":"auto","thumb":"filesystem:moz-extension://f65056c1-5c41-4f9e-b577-96f5fd992fb0/persistent/sd_previews/zN4KncDseuEPF65WBllpPVbEke9m1u.png","screen_delay":0,"position":5,"clicks":191,"deny":0,"screen_maked":1,"global_id":"zN4cDseuEPF65t6WBllpPVbEke9m1u","thumb_version":1,"id":16,"last_preview_update":1619905289896,"auto_title":"Unlock, speed up and easily transfer content from the cloud - Offcloud.com","thumb_width":728,"thumb_height":454.9101678183613,"thumb_url":"","need_sync_screen":0,"update_interval":"","rowid":16},{"url":"https://drive.google.com/drive/","title":"My Drive - Google

Sample Output file:
https://google.com/"
https://drive.google.com/drive/"

or

https://google.com/
https://drive.google.com/drive/

The match can't stop at (/), since some URLs have multiple path segments (.../text1/text2/, i.e. subdirectories) as well as different top-level domains.
Title: Re: Extracting string starting with and ending with specific strings
Post by: Hackoo on November 30, 2021, 01:25:55 PM
You can try this batch file: Extracting_Links.bat
Code:
@echo off
Mode 70,4 & color 0B
Title Extracting URLs links from InputFile by Drag and Drop
Set "InputFile=%~1"
If "%InputFile%"=="" Call :Help
Set "Tmpvbs=%Tmp%\%~n0.vbs"
Set "OutPutFile=%~n1_Output.txt"
echo( & echo(   Please wait ... Extracting URLs and Links from "%~nx1"
Call :Extract "%InputFile%" "%OutPutFile%"
Start "" "%OutPutFile%" & exit
::--------------------------------------------------------------------------------------------------------------------
:Extract <InputData> <OutPutData>
(
echo Data = WScript.StdIn.ReadAll
echo Data = Extract(Data,"(http|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%%&:/~\+#]*[\w\-\@?^=%%&/~\+#])?"^)
echo WScript.StdOut.WriteLine Data
echo Function Extract(Data,Pattern^)
echo    Dim oRE,oMatches,Match,Line
echo    set oRE = New RegExp
echo    oRE.IgnoreCase = True
echo    oRE.Global = True
echo    oRE.Pattern = Pattern
echo    set oMatches = oRE.Execute(Data^)
echo    If not isEmpty(oMatches^) then
echo        For Each Match in oMatches 
echo            Line = Line ^& Match.Value ^& vbcrlf
echo        Next
echo        Extract = Line
echo    End if
echo End Function
)>"%Tmpvbs%"
cscript /nologo "%Tmpvbs%" < "%~1" > "%~2"
If Exist "%Tmpvbs%" Del "%Tmpvbs%"
exit /b
::--------------------------------------------------------------------------------------------------------------------
:Help
Color 0C
echo(
echo(   You should drag and drop a file over,
echo(   this script "%~nx0" in order to extract URLs and Links
Timeout /T 10 /NoBreak>nul
Exit
::--------------------------------------------------------------------------------------------------------------------
8) ;)
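For comparison, here is a minimal PowerShell sketch of the rule from the first post: start at http and stop just before the next double quote. Treat it as an alternative under assumptions, not as the batch file above. Get-QuotedUrl is a made-up helper name, input.txt and urls.txt are placeholder file names, and the pattern is deliberately simpler than the one in Extracting_Links.bat.

```powershell
# Sketch only: Get-QuotedUrl is a hypothetical helper, not part of Extracting_Links.bat.
function Get-QuotedUrl {
    param([string]$Text)
    # Match "http://" or "https://" and everything after it up to,
    # but not including, the next double quote.
    [regex]::Matches($Text, 'https?://[^"]+') | ForEach-Object { $_.Value }
}

# Possible usage with placeholder file names (-Raw needs PowerShell 3.0 or later):
# Get-QuotedUrl (Get-Content -Raw 'input.txt') | Set-Content 'urls.txt'
```

Because the match stops at the quote rather than at a slash, URLs with several subdirectories are kept whole, and entries such as filesystem:moz-extension://... are skipped because they do not start with http.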
Title: Re: Extracting string starting with and ending with specific strings
Post by: iONik on November 30, 2021, 01:48:10 PM
I N S A N E !

This script would have taken me 10x longer to write than to manually extract all the websites.
Worked perfectly!

Thank You!
You don't know how much I appreciate your help.
Brian
 ;D
Title: Re: Extracting string starting with and ending with specific strings
Post by: Squashman on December 04, 2021, 09:51:17 AM
Putting this link to the O/P's other questions on the provided code here for posterity.

Inserting extracted code into new file surrounded by specific text  (https://stackoverflow.com/questions/70222069/inserting-extracted-code-into-new-file-surrounded-by-specific-text)