Welcome guest. Before posting on our computer help forum, you must register. Click here it's easy and free.

Author Topic: Update CSV files  (Read 3520 times)

0 Members and 1 Guest are viewing this topic.

carmine

    Topic Starter


    Starter

    • Experience: Beginner
    • OS: Windows 7
    Update CSV files
    « on: December 04, 2014, 11:01:47 PM »
    Hi,

    There are 30 CSV files in folder and i want to update all of them, the format of files are same.

    Matching rows by comparing Col A, B, and C. If match(s) found, the duplicate rows to be deleted and their contents are appended to the original.

    .csv file.

    A|B|C|D|E|F|G|H|.........
    x|y|z|1|2|3|4|5|
    x|y|z|7|8|9|1|2|
    o|p|q|9|6|3|2|1|
    x|y|z|3|4|7|8|1|

    modified.csv
    A|B|C|D|E|F|G|H|.........
    x|y|z|1|2|3|4|5|7|8|9|1|2|3|4|7|8|1|
    o|p|q|9|6|3|2|1|

    Any ways.






    Salmon Trout



      Genius

      Thanked: 966
      • Yes
    • Computer: Specs
    • Experience: Experienced
    • OS: Other
    Re: Update CSV files
    « Reply #1 on: December 05, 2014, 04:43:52 AM »
    That is not a csv file format. The 'c' is 'csv' stands for 'comma'. Your column separator is a pipe character '|'. So please clarify.

    However...

    1. Must it be done using batch? Can VBScript be used? If not, why not?
    2. Can temp files be written somewhere?
    3. Does each column contain just 1 character? If not, please supply more representative example data.


    carmine

      Topic Starter


      Starter

      • Experience: Beginner
      • OS: Windows 7
      Re: Update CSV files
      « Reply #2 on: December 05, 2014, 06:07:51 AM »
      That is not a csv file format. The 'c' is 'csv' stands for 'comma'. Your column separator is a pipe character '|'. So please clarify.

      However...

      1. Must it be done using batch? Can VBScript be used? If not, why not?
      2. Can temp files be written somewhere?
      3. Does each column contain just 1 character? If not, please supply more representative example data.

      Thanks Salmon for reply.
      I agree its Pipe-Delimiter file exported from excel and saved as .csv
      1. Approx. there are 20000 to 25000 rows and 200 columns. I don't mine if its more faster/flexible to achieve with VBScript, because i have atleast 30 files in which continuously new data is appended and once in month i will have to update/re-arrange it.
      2. Yes we can create the temp files. c:\Temp.
      3. No, total data is numeric not a single character. Col1=2 numeric, Col2=4 numeric, Col3=1 to 7 numericals and from Col4 fixed 6 numeric.
      Sample data:
      ST|BRIN|TP|KEY1|KEY2|KEY3|KEY4|KEY5|KEY6|KEY7|KEY8|KEY9|KEY10|KEY11|KEY12|KEY13|KEY14|KEY15|KEY16|KEY17|KEY18|KEY19|KEY20|KEY21|KEY22|KEY23|KEY24|KEY25|KEY26|KEY27|KEY28|KEY29|KEY30|KEY31|KEY32|KEY33|KEY34|KEY35|KEY36|KEY37|KEY38|KEY39|KEY40|KEY41|KEY42|KEY43|KEY44|KEY45|KEY46|KEY47|KEY48|KEY49|KEY50|KEY51|KEY52|KEY53|KEY54|KEY55|KEY56|KEY57|KEY58|KEY59|KEY60|
      53|5305|1|722101|722102|722103|722104|722105|722106|722107|722108|722109|722110|722111|722112|722113|722114|722115|722116|722117|722118|722119|722120|722121|722122|722123|722124|722125|722126|722127|722128|722129|722130|722131|722132|722133|722134|722135|722136|722137|722138|722139|722140|722141|722142|722143|722144|722145|722146|722147|722148|722149|722150|
      53|5305|1|745551|745552|745553|745554|745555|745556|745557|745558|745559|745560|745561|745562|745563|745564|745565|745566|745567|745568|745569|745570|745571|745572|745573|745574|745575|745576|745577|745578|745579|745580|745581|745582|745583|745584|745585|745586|745587|745588|745589|745590|745591|745592|745593|745594|745595|745596|745597|745598|745599|745600|
      53|5305|1|823126|813002|813003|813004|813005|813006|813007|813008|813009|813010|813011|813012|813013|813014|813015|813016|813017|813018|813019|813020|813021|813022|813023|813024|813025|813026|813027|813028|813029|813030|813031|813032|813033|813034|813035|813036|813037|813038|813039|813040|813041|813042|813043|813044|813045|813046|813047|813048|813049|813050|
      37|2037|22|575196|514202|514203|514204|514205|514206|514207|514208|514209|514210|514211|514212|514213|514214|514215|514216|514217|514218|514219|514220|514221|514222|514223|514224|514225|514226|514227|514228|514229|514230|514231|514232|514233|514234|514235|514236|514237|514238|514239|514240|514241|514242|514243|514244|514245|514246|514247|514248|514249|514250|
      37|2037|22|713951|713952|713953|713954|713955|713956|713957|713958|713959|713960|713961|713962|713963|713964|713965|713966|713967|713968|713969|713970|713971|713972|713973|713974|713975|713976|713977|713978|713979|713980|713981|713982|713983|713984|713985|713986|713987|713988|713989|713990|713991|713992|713993|713994|713995|713996|713997|713998|713999|714000|

      dbenham



        Greenhorn

        Thanked: 3
        • Experience: Expert
        • OS: Windows 7
        Re: Update CSV files
        « Reply #3 on: December 05, 2014, 09:57:05 AM »
        The basic algorithm is fairly simple.

        1) sort the input descending so header remains at top.
        2) initialize PreviousKey to something that will never match
        3) for each line: (note, writes are without \n unless explicitly stated)
             - if 1st line than write line
             - else if key = previousKey write nonKey data
             - else write \n + entire line
        4) terminate the last line with \n

        This can be done easily with VBScript or JScript or PowerShell.

        Or you could use batch coupled with JREPL.BAT
        Code: [Select]
        @echo off
        sort /r %1 /o %1
        call jrepl "^((?:.*?\|){3})(.*)" "output.Write($1==prev?$2:(prev?'\r\n':'')+$0);prev=$1;false;" /jmatch /jbeg "var prev" /f %1 /o -
        echo(>>%1

        I suppose it could be done with pure batch, but I would worry about performance, as well as the potential for lines to exceed the max batch string length of ~8191 bytes.
        Dave Benham

        Salmon Trout



          Genius

          Thanked: 966
          • Yes
        • Computer: Specs
        • Experience: Experienced
        • OS: Other
        Re: Update CSV files
        « Reply #4 on: December 05, 2014, 10:26:24 AM »
        Thanks, dbenham! I was inclining towards VBScript, because of the Dictionary Object but you have ridden to the rescue!