Welcome guest. Before posting on our computer help forum, you must register. Click here it's easy and free.

Author Topic: Record Manipulation Across 2 Files with AWK  (Read 3829 times)

0 Members and 1 Guest are viewing this topic.

eldorado

    Topic Starter


    Greenhorn

    Record Manipulation Across 2 Files with AWK
    « on: February 10, 2010, 10:33:06 PM »
    Hi there,

    I would like to do a search across 2 files and concatenate matching line by using field 1 in both file as key.
    If there is a duplicated key in file A, I want the matching line from file B to be concatenated to both the lines in file A.

    Input File_A:

    301|1234|NBS|
    302|2000|ABS|
    303|3600|ACC|
    304|2800|ABS|
    305|2880|ACC|
    306|3100|ACC|
    306|3200|BCC|
    307|2030|NBS|
    308|3600|NBS|


    Input File_B:

    301|AF_ACC_BDL|774;AF_WMG_BDL|6660;AF_REP_BDL|4735;
    303|AF_ACC_BDL|1360;
    304|AF_ACC_BDL|5092;
    305|AF_ACC_BDL|32384;
    306|AF_ACC_BDL|80859;AF_REP_BDL|80875;
    308|AF_ACC_BDL|1418;AF_WLH_BDL|78273;AF_WSL_BDL|9990;AF_WFL_BDL|9998;

    Expected output:

    301|1234|NBS|AF_ACC_BDL|774;AF_WMG_BDL|6660;AF_REP_BDL|4735;
    302|2000|ABS|
    303|3600|ACC|AF_ACC_BDL|1360;
    304|2800|ABS|AF_ACC_BDL|5092;
    305|2880|ACC|AF_ACC_BDL|32384;
    306|3100|ACC|AF_ACC_BDL|80859;AF_REP_BDL|80875;
    306|3200|BCC|AF_ACC_BDL|80859;AF_REP_BDL|80875;
    307|2030|NBS|
    308|3600|NBS|AF_ACC_BDL|1418;AF_WLH_BDL|78273;AF_WSL_BDL|9990;AF_WFL_BDL|9998;

    My code:

    #!/bin/bash

    awk -F"|" 'BEGIN {
    while ((getline < "FILE_A") > 0)
       {a[$1] = $1 FS $2 FS $3 FS}
    }
    {
       FS = "|"
       if ($1 in a)
       {
          print(a[$1]substr($0, 5))
          delete a[$1]
       }
       else
          print $0
    }
    END {
       for (i in a)
       {
          print a
       }
    }' FILE_B > OUTFILE

    My current OUTFILE:

    301|1234|NBS|AF_ACC_BDL|774;AF_WMG_BDL|6660;AF_REP_BDL|4735;
    303|3600|ACC|AF_ACC_BDL|1360;
    304|2800|ABS|AF_ACC_BDL|5092;
    305|2880|ACC|AF_ACC_BDL|32384;
    306|3200|BCC|AF_ACC_BDL|80859;AF_REP_BDL|80875;
    308|3600|NBS|AF_ACC_BDL|1418;AF_WLH_BDL|78273;AF_WSL_BDL|9990;AF_WFL_BDL|9998;
    302|2000|ABS|
    307|2030|NBS|

    Note that: 306|3100|ACC|AF_ACC_BDL|80859;AF_REP_BDL|80875;
    has not been printed.


    Any help in this is greatly appreciated. Many thanks.