Welcome guest. Before posting on our computer help forum, you must register. Click here it's easy and free.

Author Topic: Get selective columns from a dump file using MS DOS and make new dump  (Read 24158 times)

0 Members and 1 Guest are viewing this topic.

tanvir.khan

    Topic Starter


    Rookie

    • Experience: Beginner
    • OS: Windows 7
    Hi:
    I've a dump file in text format (pipe separator). I need to select specific columns from the file and make a new file. For Example, 

    Original Dump (I need to select column 3, 4 and 6; and make a new dump). How can I do it in MS DOS using a batch file?
    ----------------
    AAA|BBB|CCC|DDD|EEEE|FFFF|GGGG|HHHH
    TTT|MMM|KKK|123|FFFF|VVVV|IIII|BBBB
    XXX|SSS|AAA|ZZZ|TTTT|JJJJ|RRRR|YYYY

    Output Dump
    ---------------
    CCC|DDD|FFFF
    KKK|123|VVVV
    AAA|ZZZ|JJJJ

    Please help.

    Tanvir

    Squashman



      Specialist
    • Thanked: 134
    • Experience: Experienced
    • OS: Other
    Re: Get selective columns from a dump file using MS DOS and make new dump
    « Reply #1 on: September 03, 2014, 08:29:06 AM »
    Assuming you are using CMD.exe from Windows and not actually a version of MS DOS.
    Code: [Select]
    @echo off
    for /F "tokens=3,4,6 delims=|" %%G in (file.txt) do echo %%G^|%%H^|%%I>>newfile.txt

    tanvir.khan

      Topic Starter


      Rookie

      • Experience: Beginner
      • OS: Windows 7
      Re: Get selective columns from a dump file using MS DOS and make new dump
      « Reply #2 on: September 04, 2014, 04:59:03 AM »
      Thanks Squashman. Your post is really helpful. But I need one more help, I need to extract file with any name (*.txt or file*) instead of a static name (file.txt).

      @echo off
      for /F "tokens=3,4,6 delims=|" %%G in (file.txt) do echo %%G^|%%H^|%%I>>newfile.txt

      I've tried with *.txt or file* but it doesn't work. Could you please let me know how to resolve it.

      Thanks,Tanvir

      tanvir.khan

        Topic Starter


        Rookie

        • Experience: Beginner
        • OS: Windows 7
        Re: Get selective columns from a dump file using MS DOS and make new dump
        « Reply #3 on: September 04, 2014, 05:59:43 AM »
        I've found a problem in the script. It doesn't recognize blank field. Please help me in this issue as well. Picked column 3, 4 and 6. For Example:

        INPUT:
        --------
        VVVVV|BBBB|3212|9M9M|7777|MMMM|899
        sssss||4444|AAAA|3333|LLLL|101
        RRRR|PPPP|5555|1111|5555|PPPP|770

        It returns OUTPUT:
        ---------------------
        3212|9M9M|MMMM
        AAAA|3333|101
        5555|1111|PPPP

        Actual Output should be:
        ---------------------------
        3212|9M9M|MMMM
        4444|AAAA|LLLL
        5555|1111|PPPP

        Thanks, Tanvir

        Squashman



          Specialist
        • Thanked: 134
        • Experience: Experienced
        • OS: Other
        Re: Get selective columns from a dump file using MS DOS and make new dump
        « Reply #4 on: September 04, 2014, 07:22:42 AM »
        I've found a problem in the script. It doesn't recognize blank field. Please help me in this issue as well. Picked column 3, 4 and 6. For Example:

        INPUT:
        --------
        VVVVV|BBBB|3212|9M9M|7777|MMMM|899
        sssss||4444|AAAA|3333|LLLL|101
        RRRR|PPPP|5555|1111|5555|PPPP|770

        It returns OUTPUT:
        ---------------------
        3212|9M9M|MMMM
        AAAA|3333|101
        5555|1111|PPPP

        Actual Output should be:
        ---------------------------
        3212|9M9M|MMMM
        4444|AAAA|LLLL
        5555|1111|PPPP

        Thanks, Tanvir
        Not a problem with the script.  It is more of a problem with your data.  The FOR /F command always sees consecutive delimiters as one. If you would have provided that information upfront, I could have coded for that instance.  That is the main problem with providing obfuscated data.  It is better to provide real world examples up front.

        tanvir.khan

          Topic Starter


          Rookie

          • Experience: Beginner
          • OS: Windows 7
          Re: Get selective columns from a dump file using MS DOS and make new dump
          « Reply #5 on: September 08, 2014, 04:19:19 AM »
          Thanks for your feedback. As I've said the script works fine if all the fields have value like below: (real time example)

          SUBCOS|RATE|DATE|ACCSTATE|PREVBALANCE|ACTTIME|USERSTARE|RATEPLAN|EXPIRY|
          2|30|20150830|0|0|20160226235959|11|3|20170101000000|
          2|44|20140329|0|0|20140923235959|12|4|20140328003330|
          2|20|20140722|2|0|20150118235959|11|3|20140518151446|

          But if few fields are NULL, i.e. in second row there is no RATE (3, 4 & 6 columns are picked)

          SL|RATE|DATE|ACCSTATE|PREVBALANCE|ACTTIME|USERSTARE|RATEPLAN|EXPIRY|
          2|30|20150830|55|99|20160226235959|11|3|20170101000000|
          2||20140329|33|90|20140923235959|12|4|20140328003330|
          2|20|20140722|22|91|20150118235959|11|3|20140518151446|

          Output comes:
          ----------------
          DATE|ACCSTATE|ACTTIME|
          20150830|55|20160226235959|  ----Ok
          33|90|12|                                  -----As RATE field is blank, insted of picking DATE column, it picks ACCSTATE as column 3 which actually is 4
          20140722|22|20150118235959|  -----Ok   

          Please let me know if you need any information.

          Thanks, Tanvir

          foxidrive



            Specialist
          • Thanked: 268
          • Experience: Experienced
          • OS: Windows 8
          Re: Get selective columns from a dump file using MS DOS and make new dump
          « Reply #6 on: September 08, 2014, 04:56:47 AM »
          This uses dbenham's ParseCSV.bat to reformat the file and then Squashman's code to output the data.

          Save both batch files into the same folder as file.csv and run this one.

          Code: [Select]
          @echo off
          call ParseCSV.bat "/i:|" "/o:|" <file.csv >file.csv.tmp
          for /F "tokens=3,4,6 delims=|" %%G in (file.csv.tmp) do >>newfile.csv echo %%~G^|%%~H^|%%~I
          del file.csv.tmp




          ParseCSV.bat

          Code: [Select]
          @if (@X)==(@Y) @end /* harmless hybrid line that begins a JScrpt comment

          ::************ Documentation ***********
          ::parseCSV.bat version 1.0
          :::
          :::parseCSV  [/option]...
          :::
          :::  Parse stdin as CSV and write it to stdout in a way that can be safely
          :::  parsed by FOR /F. All columns will be enclosed by quotes so that empty
          :::  columns may be preserved. It also supports delimiters, newlines, and
          :::  quotes within quoted values. Two consecutive quotes within a quoted value
          :::  are converted into one quote.
          :::
          :::  Available options:
          :::
          :::    /I:string = Input delimiter. Default is a comma.
          :::
          :::    /O:string = Output delimiter. Default is a comma.
          :::
          :::    /E = Encode output delimiter in value as \D
          :::         Encode newline in value as \N
          :::         Encode backslash in value as \S
          :::
          :::    /D = Escape exclamation point and caret for delayed expansion
          :::         ! becomes ^!
          :::         ^ becomes ^^
          :::
          :::parseCSV  /?
          :::
          :::  Display this help
          :::
          :::parseCSV  /V
          :::
          :::  Display the version of parseCSV.bat
          :::
          :::parseCSV.bat was written by Dave Benham. Updates are available at the original
          :::posting site: http://www.dostips.com/forum/viewtopic.php?f=3&t=5702
          :::

          ::************ Batch portion ***********
          @echo off
          if "%~1" equ "/?" (
            setlocal disableDelayedExpansion
            for /f "delims=: tokens=*" %%A in ('findstr "^:::" "%~f0"') do echo(%%A
            exit /b 0
          )
          if /i "%~1" equ "/V" (
            for /f "delims=:" %%A in ('findstr /bc:"::%~nx0 version " "%~f0"') do echo %%A
            exit /b 0
          )
          cscript //E:JScript //nologo "%~f0" %*
          exit /b 0


          ************ JScript portion ***********/
          var args     = WScript.Arguments.Named,
              stdin    = WScript.Stdin,
              stdout   = WScript.Stdout,
              escape   = args.Exists("E"),
              delayed  = args.Exists("D"),
              inDelim  = args.Exists("I") ? args.Item("I") : ",",
              outDelim = args.Exists("O") ? args.Item("O") : ",",
              quote    = false,
              ln, c, n;
          while (!stdin.AtEndOfStream) {
            ln=stdin.ReadLine();
            if (!quote) stdout.Write('"');
            for (n=0; n<ln.length; n++ ) {
              c=ln.charAt(n);
              if (c == '"') {
                if (quote && ln.charAt(n+1) == '"') {
                  n++;
                } else {
                  quote=!quote;
                  continue;
                }
              }
              if (c == inDelim && !quote) c='"'+outDelim+'"';
              if (escape) {
                if (c == outDelim) c="\\D";
                if (c == "\\") c="\\S";
              }
              if (delayed) {
                if (c == "!") c="^!";
                if (c == "^") c="^^";
              }
              stdout.Write(c);
            }
            stdout.Write( (quote) ? ((escape) ? "\\N" : "\n") : '"\n' );
          }

          Squashman



            Specialist
          • Thanked: 134
          • Experience: Experienced
          • OS: Other
          Re: Get selective columns from a dump file using MS DOS and make new dump
          « Reply #7 on: September 08, 2014, 07:07:43 AM »
          I had found another post online about changing the DOUBLE PIPES to |#null#| and then running that output to another FOR /F command to un-delimit the output.  But then you have to run that data through another SET command to remove the #null#.  Would probably have issues with poison characters as well.

          tanvir.khan

            Topic Starter


            Rookie

            • Experience: Beginner
            • OS: Windows 7
            Re: Get selective columns from a dump file using MS DOS and make new dump
            « Reply #8 on: September 08, 2014, 08:27:03 AM »
            Thanks to both of you. It returns the correct output now but each field comes with " like below:

            "2"|"147500"|"1906342342"|"2"|"20140830125243"
            "2"|"0"|"1903312025"|"197"|"20140327111829"
            "2"|"765952"|"1903312029"|"160"|"20140327174850"

            I've tried removing " from parser file but the output become wrong again. Just need to eliminate the ". As I'm very beginner with DOS need further help.

            Another help I need - How can I call multiple files instead of a single one here

            call ParseCSV.bat "/i:|" "/o:|" <file.csv >file.csv.tmp

            i.e. call ParseCSV.bat "/i:|" "/o:|" <*.csv >file.csv.tmp ---I tried but it doesn't work

            Please help.

            Thanks/
            Tanvir

            Geek-9pm


              Mastermind
            • Geek After Dark
            • Thanked: 1026
              • Gekk9pm bnlog
            • Certifications: List
            • Computer: Specs
            • Experience: Expert
            • OS: Windows 10
            Re: Get selective columns from a dump file using MS DOS and make new dump
            « Reply #9 on: September 08, 2014, 08:42:21 AM »
            Is it OK if I ask a question?
            If this was a CSV file --
            Why not just use a spreadsheet to parse it.
            Just  asking.

            Squashman



              Specialist
            • Thanked: 134
            • Experience: Experienced
            • OS: Other
            Re: Get selective columns from a dump file using MS DOS and make new dump
            « Reply #10 on: September 08, 2014, 09:29:07 AM »
            Is it OK if I ask a question?
            I think the OP's last post answers your question.  They need to do it on multiple files.  Would you want to open up multiple files and manually process all of them on a daily basis?

            Squashman



              Specialist
            • Thanked: 134
            • Experience: Experienced
            • OS: Other
            Re: Get selective columns from a dump file using MS DOS and make new dump
            « Reply #11 on: September 08, 2014, 09:38:47 AM »
            Thanks to both of you. It returns the correct output now but each field comes with " like below:

            "2"|"147500"|"1906342342"|"2"|"20140830125243"
            "2"|"0"|"1903312025"|"197"|"20140327111829"
            "2"|"765952"|"1903312029"|"160"|"20140327174850"
            That is not the correct output. It has more than 3 fields. Nor would it be the correct input because you said you wanted columns 3, 4 & 6 and you only have 5 columns.

            I used your last data example
            Code: [Select]
            SL|RATE|DATE|ACCSTATE|PREVBALANCE|ACTTIME|USERSTARE|RATEPLAN|EXPIRY|
            2|30|20150830|55|99|20160226235959|11|3|20170101000000|
            2||20140329|33|90|20140923235959|12|4|20140328003330|
            2|20|20140722|22|91|20150118235959|11|3|20140518151446|
            And this is the output I got.
            Code: [Select]
            DATE|ACCSTATE|ACTTIME
            20150830|55|20160226235959
            20140329|33|20140923235959
            20140722|22|20150118235959
            Program is working for the 2nd time as you described your data. 

            Squashman



              Specialist
            • Thanked: 134
            • Experience: Experienced
            • OS: Other
            Re: Get selective columns from a dump file using MS DOS and make new dump
            « Reply #12 on: September 08, 2014, 09:50:42 AM »
            I used the following input which has surround quotes around each field
            Code: [Select]
            "2"|""|"147500"|"1906342342"|"2"|"20140830125243"
            "2"|""|"0"|"1903312025"|"197"|"20140327111829"
            "2"|""|"765952"|"1903312029"|"160"|"20140327174850"
            And the output does not have any surround quotes.  I did not change any of the code that Foxidrive posted.
            Code: [Select]
            147500|1906342342|20140830125243
            0|1903312025|20140327111829
            765952|1903312029|20140327174850

            tanvir.khan

              Topic Starter


              Rookie

              • Experience: Beginner
              • OS: Windows 7
              Re: Get selective columns from a dump file using MS DOS and make new dump
              « Reply #13 on: September 08, 2014, 10:01:29 AM »
              Dear Geek-9pm: Actually it's not a CSV, those are text files as I mentioned in my first post.

              Dear Squashman: No, files will come in a folder automatically with extension .txt; I've to parse those make a single dump (an auto job will run every 1 hr) and upload into a Oracle database using SQL Loader. So, I need to parse all a files of .txt extension an make a single dump.

              My Input files don't have quotes. But when I run the Parse Script the output comes with quote. These quotes appear because of this line in the parse script:

              if (c == inDelim && !quote) c='"'+outDelim+'"' ---I removed quotes from here, quotes disappear but returns wrong result 

              Did you use the same parse script foxidrive provided and also parsed from text file?

              Need further help.

              Thanks/Tanvir

              Squashman



                Specialist
              • Thanked: 134
              • Experience: Experienced
              • OS: Other
              Re: Get selective columns from a dump file using MS DOS and make new dump
              « Reply #14 on: September 08, 2014, 10:28:35 AM »
              My Input files don't have quotes. But when I run the Parse Script the output comes with quote. These quotes appear because of this line in the parse script:

              if (c == inDelim && !quote) c='"'+outDelim+'"' ---I removed quotes from here, quotes disappear but returns wrong result
              No.  It is not because of that line. 

              Did you use the same parse script foxidrive provided and also parsed from text file?
              I did not change any of the code that Foxidrive posted.