comm, rsync, Fslint, Meld - compare stuff


3 replies to this topic

#1 OFFLINE   sunrat

sunrat

    Thread Kahuna

  • Forum Moderators
  • 5,924 posts

Posted 31 January 2019 - 03:18 AM

I love finding new commands that make things simple. Today I was using rsync (without --delete) to sync my music collection from one computer to another. It worked great, but I ended up with 303 directories on one side and 304 on the other. Hmmm...
I wasn't about to scroll through both directories to find the culprit. I tried both Kompare and Meld, but they didn't do what I wanted. Off to StackOverflow, where I found the comm command. This command lists the entries that are in dir1 but not in dir2:
comm -23 <(ls dir1 | sort) <(ls dir2 | sort)
From man comm:

Quote

With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains
lines unique to FILE2, and column three contains lines common to both files.

-1 suppress column 1 (lines unique to FILE1)
-2 suppress column 2 (lines unique to FILE2)
-3 suppress column 3 (lines that appear in both files)
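
To see how the three switches combine, here is a minimal sketch, assuming bash (the `<(...)` process substitution is not plain POSIX sh). The directory arguments are placeholders, and `list_names`, `only_in_first`, `only_in_second` and `in_both` are made-up helper names, not part of comm. Names containing newlines would confuse it.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names below are hypothetical, not part of comm.

# Print the entry names of a directory, one per line, sorted bytewise.
# LC_ALL=C keeps sort and comm agreeing on the same collation order.
list_names() {
    ls -A "$1" | LC_ALL=C sort
}

# Entries only in the first directory (columns 2 and 3 suppressed).
only_in_first() {
    LC_ALL=C comm -23 <(list_names "$1") <(list_names "$2")
}

# Entries only in the second directory (columns 1 and 3 suppressed).
only_in_second() {
    LC_ALL=C comm -13 <(list_names "$1") <(list_names "$2")
}

# Entries present in both directories (columns 1 and 2 suppressed).
in_both() {
    LC_ALL=C comm -12 <(list_names "$1") <(list_names "$2")
}
```

After sourcing this, `only_in_first dir1 dir2` gives the same kind of list as the one-liner above, and `only_in_second dir1 dir2` gives the reverse.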

I had also inadvertently copied stacks of files that I had renamed on one side but not the other, so I ended up with multiple copies. For this I discovered Fslint, a GUI duplicate-finding program, which did a fine job. It found hundreds of files which were identical but had different names. It matches md5 sums, I believe.
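
For anyone who prefers the terminal, the general idea of matching files by checksum can be sketched as below. This is not how Fslint works internally, only an illustration of the principle. It assumes GNU coreutils (`md5sum`, and `uniq` with `-w` and `--all-repeated`), and `find_dupes` is a made-up function name.

```shell
#!/usr/bin/env bash
# Sketch only: find_dupes is a hypothetical helper and assumes GNU coreutils.

# Print groups of files under a directory that share an MD5 checksum.
# md5sum prints a 32-character hash followed by the file name, so
# uniq -w32 compares the hash only, and --all-repeated=separate prints
# every member of each duplicate group with a blank line between groups.
find_dupes() {
    find "$1" -type f -exec md5sum {} + \
        | LC_ALL=C sort \
        | uniq -w32 --all-repeated=separate
}
```

Running `find_dupes ~/Music` (path is just an example) prints only the files that have at least one twin, so unique files never show up in the list. On 200GB of music this will take a while, since every file has to be read once.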
registered Linux user number 324659  ||    The importance of Reading The *Fine* Manual! :D
For the things we have to learn before we can do them, we learn by doing them.

#2 OFFLINE   securitybreach

securitybreach

    CLI Phreak

  • Forum Admins
  • 24,619 posts

Posted 31 January 2019 - 08:25 AM

Very cool :thumbsup:

I am pretty sure that I have used the comm command in the past, but that was a long time ago. I will have to read up on it again.
Configs/PGP Key/comhack π

"Do you begin to see, then, what kind of world we are creating? It is the exact opposite of the stupid hedonistic Utopias that the old reformers imagined. A world of fear and treachery and torment, a world of trampling and being trampled upon, a world which will grow not less but more merciless as it refines itself. Progress in our world will be progress toward more pain." -George Orwell, 1984

#3 OFFLINE   ebrke

ebrke

    Board Bigwig

  • Forum MVP
  • 2,871 posts

Posted 31 January 2019 - 06:37 PM

I'm making a note of this! I certainly should have used it when I was getting ready to install a new openSUSE and was going to create all new partitions. I copied the contents of my /home to USB but was in a hurry, very carelessly didn't check the results closely, and wound up losing a lot of stuff. (The problem was copying through file manager software, not the command line, and some weird error caused the process to terminate before everything had been copied.) It's made me paranoid about checking my copies, but your example will make it easy.

Edited by ebrke, 31 January 2019 - 06:38 PM.


#4 OFFLINE   sunrat

sunrat

    Thread Kahuna

  • Forum Moderators
  • 5,924 posts

Posted 31 January 2019 - 07:01 PM

For completeness, I'd better describe what I did after this. The comm command helped in finding directories which were in one directory but not the other, and Fslint found the files which were identical but had different names. However, I still had hundreds of duplicated files which were the same music but had edited tags and filenames. I ran Meld, which found many of these but crashed when I was partway through checking and deleting the unwanted versions. It also took hours to finish comparing. There were over 200GB of files on each side to check through.

So back to the CLI and rsync to the rescue! Well, sort of. For this I found a brilliant script at StackExchange called diff-dirs, which listed every file that differed between the two sides and took only seconds to do it. I still had to spend a couple of hours manually deleting files in a file manager while referring to the list, but this took less time than Meld did just to check. There's probably a way of automating the deletions as well, but I wanted to manually check the final result.

Here's a link to the diff-dirs script on StackExchange - https://unix.stackex...ge.com/a/463214
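
The linked script is not reproduced here. As a rough sketch of the general idea only, assuming rsync is installed, a dry run can list what differs between two trees without copying or deleting anything. `list_differences` is a made-up name.

```shell
#!/usr/bin/env bash
# Sketch only, not the linked diff-dirs script. list_differences is hypothetical.

# -r  recurse into directories
# -n  dry run: report what would happen, change nothing
# -c  decide by checksum rather than size and modification time
# -i  itemize each change, one line per file
# --delete  also report files that exist only on the second side
list_differences() {
    rsync -rnci --delete "$1"/ "$2"/
}
```

With `list_differences dir1 dir2`, files that exist only in dir1 or whose contents differ show up as itemized lines, and files that exist only in dir2 show up as `*deleting` lines. Because of `-n`, nothing is actually deleted. The `-c` option reads every file on both sides, so it is slower than the default size and time check.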