Big Mess...
So I got a 4TB external drive, and I was going to just create folders named 80GB, 160GB, 500GB, 1TB, and 1.5TB and xcopy the contents of those five external hard drives over to the 4TB drive. But then I got to thinking that many of these drives hold lots of redundant data that would waste space on the 4TB, since together they could consume up to 3.24TB. Maybe there is a way to merge all the data into one place and keep only one copy of each file name at each date/time stamp, while still copying the older-dated versions of those files over to the 4TB as well.
Windows is good at detecting duplicates during a copy and prompting you about what to do, such as replacing one date/time version with another or keeping both copies. But the bigger mess is that I have the same files scattered among various paths, and I really only need one copy of each file name at each date/time stamp; the paths aren't important.
Looking online I found that CCleaner has a duplicate finder, which I never knew about. But it requires a very manual approach, deciding what to do with each and every duplicate found at each date/time stamp. I'm looking for an automatic method.
Checking here to see if there is a nifty batch or other script method that someone can point me to that will copy only one copy of each file name at each date/time stamp to a destination drive. Say I have 12 files all with the same file name, among which there are 3 different date/time stamps: it should copy only those 3 differently-stamped files out of the 12 to the destination drive, no matter the pathing.
To me, if it were done in batch, it might involve a dynamic exclusion list file with rules that somehow record that a file of a given name and date/time stamp has already been copied, so xcopy ignores further copies of it from the source drives. But I'm not sure how to build an exclusion list that would make xcopy copy only a single instance of each file, where the same file name can carry many date/time stamps under non-specific (wildcard) paths. The destination path doesn't need to be preserved for duplicates, so it's okay if branches of the tree aren't replicated from the source drives to the destination drive.
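If something beyond pure batch/xcopy is acceptable, the "dynamic exclusion list" can be sketched as an in-memory set of (filename, timestamp) pairs in Python instead of a rules file that xcopy reads. This is only a minimal sketch: the function name `copy_unique` and the source/destination paths are made up for illustration, and timestamps are compared at one-second resolution.

```python
# Sketch: copy one instance of each (filename, mtime) pair to a flat
# destination folder, skipping later duplicates no matter the source path.
# SOURCE_DIRS / DEST are placeholders you would point at your real drives.
import os
import shutil
from pathlib import Path

def copy_unique(source_dirs, dest):
    """Copy one file per unique (name, timestamp); flatten the paths."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    seen = set()   # the "dynamic exclusion list", kept in memory
    copied = 0
    for src in source_dirs:
        for root, _dirs, files in os.walk(src):
            for name in files:
                path = Path(root) / name
                stamp = int(path.stat().st_mtime)    # second resolution
                key = (name.lower(), stamp)          # Windows names are case-insensitive
                if key in seen:
                    continue                         # already copied this name+timestamp
                seen.add(key)
                # Same name but a different timestamp: keep both by suffixing.
                target = dest / name
                n = 1
                while target.exists():
                    target = dest / f"{path.stem}({n}){path.suffix}"
                    n += 1
                shutil.copy2(path, target)           # copy2 preserves the timestamp
                copied += 1
    return copied
```

`shutil.copy2` is used rather than plain `copy` so the date/time stamps survive the move to the 4TB drive, since the whole scheme keys off them.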
Two of my external source drives I know have heavy redundancy, because I wrote a program in a mix of C and system calls to the command shell that I used to back up CD-R and DVD-R data discs: hit a key and it would create a folder named 1, 2, 3, 4, and so on, and xcopy the entire contents of the disc to the external drive under the current folder number. Many backups were copied to the external drives that way, so some projects have many same-named files with various date/time stamps. Whereas many people would be happy with just the latest version of a file, I like having all files of the same name at all date/time stamps. That way, if I decide to pick a project back up from an earlier version of the source code and take it in a different direction, I can use that version as the base to build on, instead of the newest version, which may be completely reworked code that doesn't apply well to the original approach, since lots would need to be chopped out and declarations removed to baseline it for a different branch of the code. That's the best explanation of why I want all the different date/time stamps of same-named files.
The original source drives I am going to put into storage as read-only archives, and the 4TB will be a single drive that serves as the one place to find all my data dating back to the late 1980s and early 1990s, including data from 5.25" disks in some places. Years ago I threw away trash bags full of floppies after burning it all to a 650MB CD-R in the late 1990s to save space, since CD-R discs hold up better with age than magnetic tape storage. But no form of media is immune to dying, so I have constantly backed up all my data rolling forward onto newer means of storage, in multiple places, so that if, say, that 80GB external died tomorrow, the data is likely also on the 160GB that replaced it as the main drive years ago, back when 160GB used to be a lot of space.
Currently I am better about naming conventions for my folders and projects: I try to avoid identical file names and add numeric version indicators such as file001, file002, or file09152017, which also makes it way easier to find a specific version of a file. So I've improved on my past sloppiness with my data, but right now I just have a pile of data in various paths with lots of unneeded redundancy, and I figured I'd check here to see if anyone has a good script or ideas on how to achieve what I am trying to do.
Feeling like there might be an easier method than my current idea, which is to export a DIR dump and have a script filter it down to one line per file name per unique date/time stamp, no matter the path, then have xcopy run down through that list, executing an individual file copy to the destination for each line, so only one instance of like-named files with like date/time stamps is copied, but every differing date/time stamp is.
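The DIR-dump idea could also be sketched without parsing DIR output at all: walk the drive, record one full source path per unique (filename, timestamp), and write those paths to a list file that a later loop (or one xcopy/copy per line) can work through. The function name `build_copy_list` and the file names are placeholders, not a real tool.

```python
# Sketch: produce the filtered "DIR dump" as a plain list file, one source
# path per unique filename+timestamp, for a later per-file copy pass.
import os
from pathlib import Path

def build_copy_list(source_dir, list_file):
    """Write one source path per unique (name, mtime) pair to list_file."""
    seen = set()
    lines = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            path = os.path.join(root, name)
            stamp = int(os.path.getmtime(path))     # second resolution
            key = (name.lower(), stamp)
            if key not in seen:
                seen.add(key)
                lines.append(path)                  # first instance wins
    Path(list_file).write_text("\n".join(lines), encoding="utf-8")
    return len(lines)
```

Keeping the list as a separate artifact has the side benefit that you can eyeball it before any copying happens, which a live exclusion list inside xcopy wouldn't let you do.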
Or maybe there is even a tool out there, other than batch and xcopy, that can do this that I am unaware of. I am thinking there must be others who also have a mess of data thrown into one or more places with unnecessary redundancy, say 12 copies of the same file at the same date/time stamp on one drive when only 1 is needed.
Even copying all the data to the 4TB drive via xcopy and then running a batch or script that kills off duplicates (same file name, same date/time stamp) would work, as a more simplistic approach than detecting duplicates during the copy process. After the copying is done it's basically just a redundancy garbage cleanup, keeping one copy of each unique date/time stamp of each file name. If a file is located in multiple paths, only the first instance with that name and date/time stamp is protected, and all the others of the same name and date/time stamp are deleted, or tossed into a folder named duplicates which I can look through quickly and then delete on my own.
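That after-the-fact cleanup could be sketched like this: scan the destination drive, keep the first instance of each (name, timestamp), and move every later duplicate into a "duplicates" folder for manual review rather than deleting outright. Again just a sketch; `quarantine_duplicates` and the folder name are made up, and it assumes one flat quarantine folder is acceptable.

```python
# Sketch: post-copy cleanup. Keep the first instance of each filename+mtime;
# move the rest into a "duplicates" folder for manual review, not deletion.
import os
import shutil
from pathlib import Path

def quarantine_duplicates(root, dup_dir="duplicates"):
    """Move every duplicate (same name, same mtime) under root into dup_dir."""
    root = Path(root)
    quarantine = root / dup_dir
    quarantine.mkdir(exist_ok=True)
    seen = set()
    moved = 0
    for dirpath, _dirs, files in os.walk(root):
        if Path(dirpath) == quarantine:
            continue                     # don't rescan what we've already moved
        for name in files:
            path = Path(dirpath) / name
            key = (name.lower(), int(path.stat().st_mtime))
            if key in seen:
                target = quarantine / name
                n = 1
                while target.exists():   # avoid clobbering inside the quarantine
                    target = quarantine / f"{path.stem}({n}){path.suffix}"
                    n += 1
                shutil.move(str(path), str(target))
                moved += 1
            else:
                seen.add(key)            # first instance stays protected
    return moved
```

Moving to a quarantine folder instead of deleting matches the "look through quickly and then delete on my own" idea, and leaves an undo path if the name+timestamp match ever turns out to be two genuinely different files.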