Computer Hope

Microsoft => Microsoft DOS => Topic started by: PhilD on April 25, 2017, 01:18:00 PM

Title: remove lines in txt files
Post by: PhilD on April 25, 2017, 01:18:00 PM
Windows 7

I have a number of folders each containing a lot of text files.  In each file I want to remove lines up to and including a line starting with 'xxxxxxx'.  I would prefer that the edited files be saved in a folder other than the original folder, i.e. I don't want to lose the original files, although I do have backup copies of the files.

Ideally I would prefer a freeware utility with a GUI. I tried to find something, but everything I found was overkill and/or quite expensive for occasional use.  FWIW: a friend recently recommended Bulk Rename Utility (BRU) to do some tricky renaming of the text files, which I've done.  If someone knows of a freeware utility of similar calibre to BRU to do the bulk editing, that would be ideal. 

Otherwise a batch command file would be OK. 

I've never been any good at writing any sort of computer code. How anyone can comprehend, let alone type, so-called 'regular  expressions' always astonishes me.  But I am pretty good with cut and paste  ;D

Thanks - PhilD   
Title: Re: remove lines in txt files
Post by: PhilD on April 25, 2017, 02:12:19 PM
Addendum

If the 'xxxxxxxx' string isn't present in a file, that file needs to be handled as an exception.
PhilD
Title: Re: remove lines in txt files
Post by: Hackoo on April 25, 2017, 03:28:58 PM
Could you be more explicit? Please show us an example file to explain what you are aiming for!
Title: Re: remove lines in txt files
Post by: patio on April 25, 2017, 04:10:58 PM
Welcome to the DOS section...Hint:...get used to it.
Title: Re: remove lines in txt files
Post by: PhilD on April 25, 2017, 05:18:04 PM
I wasn't sure where to post this, but I anticipated a batch script solution so I thought DOS Box - that'll do.

Sample file attached - can't attach a real one, not my property and contains private data.  There are thousands of files, they are not large files, 30-1000KB each.   

In each text file in an Input folder I want to delete everything down to and including the line that starts with, say, Purpose: and write the edited file to an Output folder. If Purpose: is not found, write the file to an Exceptions folder.

So a command might look something  like

Code:
EditTextFiles -i="d:\textdata\first set of files" -s="Purpose:" -o="d:\textdata\first set of files\edited" -e="d:\textdata\first set of files\exceptions"

where

     -i is the input directory/folder;
     -s is the search string;
     -o is the output directory/folder;
     -e is the exceptions directory/folder.
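
For readers who want to see the requested behaviour written out, here is a minimal sketch of such a command in Python rather than batch. Only the four switches come from the post above. The function names `strip_header`, `process_folder` and `main` are made up for illustration, and the sketch assumes the files use an ASCII-compatible encoding and that only the .txt files directly inside the input folder should be processed.

```python
import argparse
import shutil
import sys
from pathlib import Path


def strip_header(data, marker):
    """Return the bytes after the first line starting with marker, or None if no line does."""
    lines = data.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.startswith(marker):
            return b"".join(lines[index + 1:])
    return None


def process_folder(input_dir, search, output_dir, exceptions_dir):
    """Edit each .txt file in input_dir; the originals are only read, never changed."""
    marker = search.encode("utf-8")  # assumption: files are ASCII-compatible
    input_dir = Path(input_dir)
    output_dir = Path(output_dir)
    exceptions_dir = Path(exceptions_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    exceptions_dir.mkdir(parents=True, exist_ok=True)
    edited, exceptions = [], []
    # Non-recursive on purpose, so result folders placed inside input_dir are not rescanned
    for source in sorted(input_dir.glob("*.txt")):
        if not source.is_file():
            continue
        remainder = strip_header(source.read_bytes(), marker)
        if remainder is None:
            shutil.copy2(source, exceptions_dir / source.name)
            exceptions.append(source.name)
        else:
            (output_dir / source.name).write_bytes(remainder)
            edited.append(source.name)
    return edited, exceptions


def main(argv):
    parser = argparse.ArgumentParser(description="Remove leading lines from text files")
    parser.add_argument("-i", dest="input_dir", required=True, help="input folder")
    parser.add_argument("-s", dest="search", required=True, help="search string")
    parser.add_argument("-o", dest="output_dir", required=True, help="output folder")
    parser.add_argument("-e", dest="exceptions_dir", required=True, help="exceptions folder")
    args = parser.parse_args(argv)
    return process_folder(args.input_dir, args.search, args.output_dir, args.exceptions_dir)


# Run only when arguments are supplied on the command line
if __name__ == "__main__" and len(sys.argv) > 1:
    edited, exceptions = main(sys.argv[1:])
    print(f"{len(edited)} edited, {len(exceptions)} exceptions")
```

Saved under a hypothetical name such as edit_text_files.py, it would be called with the switches and values separated by spaces, for example `-i "d:\textdata\first set of files" -s "Purpose:"` and so on. Working on raw bytes keeps the remaining lines and their line endings exactly as they were in the original file.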


The person who 'owns' the data is a competent computer user, but not a computer geek, hence a preference for a GUI based tool.  But a batch script will do.

PhilD



[attachment deleted by admin to conserve space]
Title: Re: remove lines in txt files
Post by: Geek-9pm on April 25, 2017, 08:29:55 PM
Once you have a thing that works, you can 'decorate' it with VBScript.
You can write VBScript files that will run alongside batch files.
You might look at this:
https://social.msdn.microsoft.com/Forums/en-US/6cf20733-75c4-4018-81dc-22369020e492/creating-a-gui-with-visual-basic?forum=Vsexpressvb
Also this:
http://www.instructables.com/id/How-to-Make-a-message-box-using-VBScript/
And:
https://community.spiceworks.com/topic/93619-vbscript-to-gui-or-to-visual-basic

Of course, none of the above are specific to what you are doing.  But almost anything you do in batch can be made into a GUI program. A VBScript can invoke a command line program and get the results sent to a file.

Some users hate using anything related to Visual Basic. Nevertheless, Visual Basic and its offspring have done and still do a lot of work in the IT industry.  :)

https://en.wikipedia.org/wiki/Visual_Basic
Quote
On April 8, 2008, Microsoft stopped supporting Visual Basic 6.0 IDE.
...
In 2014, some software developers still preferred Visual Basic 6.0 over its successor, Visual Basic .NET.[3][9] In 2014 some developers lobbied for a new version of Visual Basic 6.0.[10][11][12][13] In 2016, Visual Basic 6.0 won the technical impact award at The 19th Annual D.I.C.E. Awards.[14][15][16] A dialect of Visual Basic, Visual Basic for Applications (VBA), is used as a macro or scripting language within several Microsoft applications, including Microsoft Office.[17]
Some early versions of Visual Basic are easy to learn and are compatible with current versions of Windows. Here is a link for the free 2008 version:
http://www.freewarefiles.com/Microsoft-Visual-Basic-Express-Edition_program_17931.html    ;D

Title: Re: remove lines in txt files
Post by: PhilD on April 25, 2017, 09:52:51 PM
Thanks - but I'm a minimalist - less is better, least is best.  This is a short term need for dealing with the thousands of archived text files. The ongoing updates won't come from text files, they will go directly into the database as LOB transactions.

For the job at hand a batch file solution will be perfectly adequate.

PhilD


Title: Re: remove lines in txt files
Post by: Geek-9pm on April 25, 2017, 11:11:15 PM
OK. You are the one who keeps adding things.
But you first said:
Quote
  In each file I want to remove lines up to and including a line starting with 'xxxxxxx'. 
That is almost trivial. The pseudocode is:

open input file and output file
found = false
begin loop
  read a line; at end of file exit loop
  if found, write line to output
  else if line starts with 'xxxxxxx', set found = true
end loop
close both files

Is that what you want?
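
Spelled out for a single file, a read loop that keeps only what follows the marker line, which is what the original request asks for, could look like this minimal Python sketch. The function name `copy_after_marker` is made up, and the reader and writer are assumed to be already-open text files.

```python
def copy_after_marker(reader, writer, marker):
    """Copy the lines that follow the first line starting with marker.

    Returns True if the marker line was found, False otherwise.
    """
    found = False
    for line in reader:
        if found:
            # Past the marker: keep everything, even later lines that repeat the marker
            writer.write(line)
        elif line.startswith(marker):
            # The marker line itself is dropped along with everything before it
            found = True
    return found
```

A False return value tells the caller that nothing was written, which is the case the thread routes to a separate exceptions folder.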


Title: Re: remove lines in txt files
Post by: PhilD on April 26, 2017, 12:21:42 AM
What I wrote was
I have a number of folders each containing a lot of text files.  In each file I want to remove lines up to and including a line starting with 'xxxxxxx'. 

I did add a post because I forgot the exception condition of the xxxxxxx not being found

I provided a sample and clarification as requested by Hackoo.

I responded to Geek-9pm's suggestion that I write something in Basic. 

If it's a batch file, I want to be able to specify the input folder, the search string, the output folder and the exceptions folder on the command line; in essence I don't want those things hard-coded in the batch file.

So assuming xxxxxx is 'Purpose:' for the file shown in the attachment, I do not want lines 1 to 15 to be written; the output would consist of lines 16 until the end of file

If the xxxxxx isn't found copy the file to an exceptions folder

And I want it to operate on every .txt file in the input folder.

I appreciate it is not very hard for you, but at 83 I find coding hard. 

Thanks PhilD

[attachment deleted by admin to conserve space]
Title: Re: remove lines in txt files
Post by: Geek-9pm on April 26, 2017, 01:05:55 AM
You are 83? Now I understand why you want it simple.
I am 78 and get tired very easy. For me, it works best if I can divide things into little parts. Tomorrow I will look at this again and if nobody has given you a batch file, I will try to do something in a script.   :)
Title: Re: remove lines in txt files
Post by: PhilD on April 26, 2017, 01:37:33 AM
Thanks pal - had mild stroke a few years ago, it zapped any coding skills I ever had. 

Title: Re: remove lines in txt files
Post by: Hackoo on April 26, 2017, 01:46:16 AM
Hi  ;)
I made this script just for testing on one file. Please confirm whether this is what you want for a single file.
Code:
@echo off
Set "Stop_String=Purpose"
Set "File=Sample.txt"
Set "OutPutFile=OutPutFile.txt"
if exist "%OutPutFile%" Del "%OutPutFile%"
set /A "Count=0"

For /f "tokens=1 delims=:" %%A in ('findstr /I /N "%Stop_String%" "%File%"') Do (
Set /a "Count=%%A"
)
echo We found the string "%Stop_String%" at line number %Count%
pause & Cls

For /f "skip=%Count% delims=*" %%a in ('Type "%File%"') do (
echo %%a 
echo %%a >> "%OutPutFile%"
)
echo.
Echo Hit any key to open the output file ...
pause>nul
Start "" "%OutPutFile%" & exit
Title: Re: remove lines in txt files
Post by: PhilD on April 26, 2017, 02:24:34 AM
Thanks a lot Hackoo :)

Yeah it finally worked - my blunder :||

I didn't copy all the code, I don't see many SMF forums

PD



Title: Re: remove lines in txt files
Post by: Hackoo on April 26, 2017, 11:56:10 AM
Hi  ;)
This is an updated version with a Browse for Folder dialog  ;D
Just give it a try and tell me the result  :)
Code:
@echo off
Title Search for files and remove lines in text files
mode con cols=75 lines=5 & Color 0A
REM We set the variable Folder with the function Browse4Folder
Call :Browse4Folder "Choose source folder to scan for files" "c:\scripts"
Set "Folder=%Location%"
Rem If the variable %Folder% has a trailing backslash, we remove it!
IF "%Folder:~-1%"=="\" SET "Folder=%Folder:~0,-1%"
If "%errorlevel%" EQU "0" (
echo( & echo(
echo You chose this folder "%Folder%"
Timeout /T 2 /nobreak>nul
) else (
echo( & echo(
echo  "%Folder%"
Timeout /T 2 /nobreak>nul & Exit
)
Set "Stop_String=Purpose"
Set "Edited_Files=%HomeDrive%\Edited_Files"
Set "Exceptions_Folder=%HomeDrive%\Exceptions"
If not exist "%Edited_Files%" MD "%Edited_Files%"
If not exist "%Exceptions_Folder%" MD "%Exceptions_Folder%"
SetLocal EnableDelayedExpansion
for /f "delims=" %%f in ('Dir /b /s "%Folder%\*.txt"') Do (
Cls
echo.
set /A "Count=0"
set "InputFile=%%~dpFf"
findstr /I /C:"%Stop_String%" "!InputFile!" >nul 2>&1
If "!ErrorLevel!" EQU "0" (
Call :Counting "%Stop_String%" "!InputFile!"
echo We found the string "%Stop_String%" at line number !Count! on "!InputFile!"
Timeout /T 4 /nobreak>nul
Set "OutPutFile=%Edited_Files%\%%~nf_edited.txt"
if exist "!OutPutFile!" Del "!OutPutFile!"
Call :Write2File "!InputFile!" "!OutPutFile!"
) else (
Call :Copyfiles "!InputFile!" "!Exceptions_Folder!"
)
)
Explorer "%Edited_Files%" & exit
::********************************************************************
:Counting <Stop_String> <InputFile>
For /f "tokens=1 delims=:" %%A in ('findstr /I /N "%~1" "%~2"') Do (
Set /a "Count=%%A"
)
exit /b
::********************************************************************
:Write2File <InputFile> <OutPutFile>
For /f "skip=%Count% delims=*" %%a in ('Type "%~1"') do (
echo %%a >> "%~2"
)
exit /b
::********************************************************************
:Browse4Folder
set Location=
set vbs="%temp%\_.vbs"
set cmd="%temp%\_.cmd"
for %%f in (%vbs% %cmd%) do if exist %%f del %%f
for %%g in ("vbs cmd") do if defined %%g set %%g=
(
    echo set shell=WScript.CreateObject("Shell.Application"^)
    echo set f=shell.BrowseForFolder(0,"%~1",0,"%~2"^)
    echo if typename(f^)="Nothing" Then
    echo wscript.echo "set Location=Dialog Cancelled"
    echo WScript.Quit(1^)
    echo end if
    echo set fs=f.Items(^):set fi=fs.Item(^)
    echo p=fi.Path:wscript.echo "set Location=" ^& p
)>%vbs%
cscript //nologo %vbs% > %cmd%
for /f "delims=" %%a in (%cmd%) do %%a
for %%f in (%vbs% %cmd%) do if exist %%f del /f /q %%f
for %%g in ("vbs cmd") do if defined %%g set %%g=
goto :eof
::********************************************************************
:Copyfiles <Source> <Target>
Copy /Y "%~1" "%~2" >nul 2>&1
goto :eof
::********************************************************************
Title: Re: remove lines in txt files
Post by: PhilD on April 26, 2017, 02:38:15 PM
Hi Hackoo - I'm on to it  - Thanks * 80

I'll be back as soon as possible
Title: Re: remove lines in txt files
Post by: PhilD on April 26, 2017, 03:10:54 PM
I'm back - fan effing fantastic ;D ;D ;D

Couple of changes would make it perfect

Thanks heaps 

PhilD
Title: Re: remove lines in txt files
Post by: Geek-9pm on April 26, 2017, 06:04:30 PM
Timeout used to come from a resource kit, but it is built into Windows 7.
https://ss64.com/nt/timeout.html

Syntax
      TIMEOUT delay [/nobreak]

Key
   delay  Delay in seconds (between -1 and 100000) to wait before continuing.
          The value -1 causes the computer to wait indefinitely for a keystroke
          (like the PAUSE command)

   /nobreak
          Ignore user key strokes. (Windows 7 or greater)

Title: Re: remove lines in txt files
Post by: PhilD on April 26, 2017, 08:13:35 PM
A couple of changes would make it perfect:
  • Could you write the output to a subfolder (Edited Files) of the input folder,
  • ditto the Exceptions folder
  • The output files must keep the same names - they are used in the next step, for data matching
@Hackoo - could we add these three 'nice to have' changes  ;)
Thanks again PhilD
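
One thing to watch with this layout: if the result folders live inside the input folder and the scan is recursive, a second run would pick up the already-edited files. The minimal Python sketch below (not batch, and not from the thread) shows one way to work out the destination paths while skipping the two result folders. The folder name "Exceptions" and the function name `plan_destinations` are assumptions for illustration.

```python
from pathlib import Path

EDITED_NAME = "Edited Files"    # subfolder name taken from the request above
EXCEPTIONS_NAME = "Exceptions"  # assumed name for the exceptions subfolder


def plan_destinations(input_dir):
    """Map each source .txt under input_dir to its (edited, exception) destination paths."""
    input_dir = Path(input_dir)
    edited_dir = input_dir / EDITED_NAME
    exceptions_dir = input_dir / EXCEPTIONS_NAME
    plan = {}
    for source in sorted(input_dir.rglob("*.txt")):
        # Skip files already inside the result folders so a rerun does not edit them again
        if edited_dir in source.parents or exceptions_dir in source.parents:
            continue
        # Same file name as the source, so the later data-matching step still works
        plan[source] = (edited_dir / source.name, exceptions_dir / source.name)
    return plan
```

Because every result lands in one flat folder under its original name, two source files with the same name in different subfolders would overwrite each other. Whether that can happen depends on the data.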
Title: Re: remove lines in txt files
Post by: PhilD on April 27, 2017, 12:25:52 AM
@Hakoo

Hi - we have run a couple of hundred files through the program and found one problem.

It seems the files sometimes (~5%) have "Purpose" repeated, but these later lines need to be retained - so once the first one is found your program needs to stop looking for the Stop_String.

For the attached sample input file, the output should start at line 18 and continue through to line 29

Thanks PhilD

[attachment deleted by admin to conserve space]
Title: Re: remove lines in txt files
Post by: Hackoo on April 27, 2017, 07:39:07 AM
Sorry, I'm busy so I can't reply to you quickly!
Just give this modification a try:
Code:
@echo off
Title Search for files and remove lines in text files
mode con cols=75 lines=5 & Color 0A
REM We set the variable Folder with the function Browse4Folder
Call :Browse4Folder "Choose source folder to scan for files" "c:\scripts"
Set "Folder=%Location%"
Rem If the variable %Folder% has a trailing backslash, we remove it!
IF "%Folder:~-1%"=="\" SET "Folder=%Folder:~0,-1%"
If "%errorlevel%" EQU "0" (
echo( & echo(
echo You chose this folder "%Folder%"
Timeout /T 2 /nobreak>nul
) else (
echo( & echo(
echo  "%Folder%"
Timeout /T 2 /nobreak>nul & Exit
)
Set "Stop_String=Purpose"
Set "Edited_Files=%HomeDrive%\Edited_Files"
Set "Exceptions_Folder=%HomeDrive%\Exceptions"
If not exist "%Edited_Files%" MD "%Edited_Files%"
If not exist "%Exceptions_Folder%" MD "%Exceptions_Folder%"
SetLocal EnableDelayedExpansion
for /f "delims=" %%f in ('Dir /b /s "%Folder%\*.txt"') Do (
Cls
echo.
set /A "Count=0"
set "InputFile=%%~dpFf"
findstr /I /C:"%Stop_String%" "!InputFile!" >nul 2>&1
If "!ErrorLevel!" EQU "0" (
Call :Counting "%Stop_String%" "!InputFile!"
echo We found the string "%Stop_String%" at line number !Count! on "!InputFile!"
Timeout /T 4 /nobreak>nul
Set "OutPutFile=%Edited_Files%\%%~nxf"
if exist "!OutPutFile!" Del "!OutPutFile!"
Call :Write2File "!InputFile!" "!OutPutFile!"
) else (
Call :Copyfiles "!InputFile!" "!Exceptions_Folder!"
)
)
Rem Explorer "%Edited_Files%" & exit
echo(
set "Title=Search for files and remove lines in text files"
set "Msg=%Title%\nEnd of the script"
Call :Speak "%Msg%"
Rem 64=vbInformation, 48=vbExclamation, 16=vbCritical 32=vbQuestion
Call :MsgBox "%Msg%" 64 "%Title%"
exit
::********************************************************************
:Counting <Stop_String> <InputFile>
For /f "tokens=1 delims=:" %%A in ('findstr /I /N "%~1" "%~2"') Do (
Set /a "Count=%%A"
)
exit /b
::********************************************************************
:Write2File <InputFile> <OutPutFile>
For /f "skip=%Count% delims=*" %%a in ('Type "%~1"') do (
echo %%a >> "%~2"
)
exit /b
::********************************************************************
:Browse4Folder
set Location=
set vbs="%temp%\_.vbs"
set cmd="%temp%\_.cmd"
for %%f in (%vbs% %cmd%) do if exist %%f del %%f
for %%g in ("vbs cmd") do if defined %%g set %%g=
(
    echo set shell=WScript.CreateObject("Shell.Application"^)
    echo set f=shell.BrowseForFolder(0,"%~1",0,"%~2"^)
    echo if typename(f^)="Nothing" Then
    echo wscript.echo "set Location=Dialog Cancelled"
    echo WScript.Quit(1^)
    echo end if
    echo set fs=f.Items(^):set fi=fs.Item(^)
    echo p=fi.Path:wscript.echo "set Location=" ^& p
)>%vbs%
cscript //nologo %vbs% > %cmd%
for /f "delims=" %%a in (%cmd%) do %%a
for %%f in (%vbs% %cmd%) do if exist %%f del /f /q %%f
for %%g in ("vbs cmd") do if defined %%g set %%g=
goto :eof
::********************************************************************
:Copyfiles <Source> <Target>
Copy /Y "%~1" "%~2" >nul 2>&1
goto :eof
::********************************************************************
:MsgBox <Msg> <Type> <Title>
echo MsgBox Replace("%~1","\n",vbCrLf),"%~2","%~3" > "%tmp%\%~n0.vbs"
Cscript /nologo "%tmp%\%~n0.vbs" & Del "%tmp%\%~n0.vbs"
exit /b
::********************************************************************
:Speak <msg>
(
    echo Set sapi=Createobject("sapi.spvoice"^)
    echo sapi.Speak("%~1"^)
)>"%tmp%\%~n0.vbs"
Cscript /nologo "%tmp%\%~n0.vbs"
Del "%tmp%\%~n0.vbs"
exit /b
::*********************************************************************
Title: Re: remove lines in txt files
Post by: PhilD on April 27, 2017, 02:32:20 PM
Hi Hackoo

Thanks for the changes, but the issue that really matters is the one I tried to explain in my last post. 

The attached zip contains a Sample .txt file which has "Purpose:" at lines 15 and 21

And a Sample Expected.txt file that I created using a text editor, in which I removed lines 1-15 but retained everything after that, including line 21

So once the first Stop_String is found, the program can write the rest of the input file to the output file, and move on to the next file.

I hope that makes sense

PhilD



[attachment deleted by admin to conserve space]
Title: Re: remove lines in txt files
Post by: PhilD on April 27, 2017, 05:07:01 PM
Hackoo - on the Sample.txt I never see the message
Code:
We found the string Purpose at line number 15 on Sample.txt
but I do see the message
Code:
We found the string Purpose at line number 21 on Sample.txt
I assumed I would see both   ???

PhilD
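
A likely explanation, going by the script as posted: the :Counting loop assigns Count once for every matching line that findstr reports, and the message is only echoed after the loop has finished, so only the last match (line 21) ever shows up. The Python sketch below imitates that behaviour next to a first-match version. It is an illustration and not a batch fix: the function names are made up, and the 29-line sample is an invented stand-in for the attachment, with "Purpose" on lines 15 and 21.

```python
def matching_line_numbers(lines, needle):
    """1-based numbers of the lines containing needle, ignoring case."""
    needle = needle.lower()
    return [number for number, line in enumerate(lines, start=1) if needle in line.lower()]


def count_as_in_thread(lines, needle):
    """Imitate the :Counting loop: every match overwrites Count, so the last one wins."""
    count = 0
    for number in matching_line_numbers(lines, needle):
        count = number
    return count


def count_first_match(lines, needle):
    """Stop at the first match; 0 means the string was not found."""
    for number in matching_line_numbers(lines, needle):
        return number
    return 0


# Invented stand-in for the sample file: 29 lines, "Purpose" on lines 15 and 21
sample = [f"header line {n}" for n in range(1, 15)]
sample.append("Purpose: first occurrence")
sample += [f"body line {n}" for n in range(16, 21)]
sample.append("Purpose: repeated, must be kept")
sample += [f"body line {n}" for n in range(22, 30)]

last = count_as_in_thread(sample, "Purpose")
first = count_first_match(sample, "Purpose")
kept = sample[first:]  # skipping `first` lines keeps lines 16 to 29
print(last, first, len(kept))  # prints: 21 15 14
```

Skipping 15 lines instead of 21 keeps the repeated "Purpose" line in the output, which matches the expected file described above.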