Monday, June 28, 2010

To find duplicate entries in filenames

I used a script to convert the pictures from a party to two different sizes, 640 and 1024. I kept the right resolution for each chosen picture and deleted the rest. But as usual, some pictures are still present in both sizes.

My ls -l command gives a list similar to the following.

-rwx------ 1 aaaa aaa  39191 2010-06-28 22:49 2010-IMG_4354-640.JPG
-rwx------ 1 aaaa aaa  41129 2010-06-28 22:49 2010-IMG_4355-640.JPG
-rwx------ 1 aaaa aaa  46668 2010-06-28 22:49 2010-IMG_4356-640.JPG
-rwx------ 1 aaaa aaa 121721 2010-06-28 22:49 2010-IMG_4359-1024.JPG
-rwx------ 1 aaaa aaa 104468 2010-06-28 22:49 2010-IMG_4360-1024.JPG
-rwx------ 1 aaaa aaa 164638 2010-06-28 22:50 2010-IMG_4361-1024.JPG

I will use just this listing to find the duplicates. The naming scheme makes the task simple: two versions of the same picture differ only in the size tag (640 or 1024) at the end of the name.

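A single command finds our duplicates. It strips every line down to the picture number and lets uniq -d print the numbers that occur more than once. This works because ls sorts by name, so both sizes of a picture end up on adjacent lines.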

 ls -l | cut -d "_" -f2 | cut -d "." -f1 | sed 's/-/\t/' | cut -f1 | uniq -d
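
The same idea can be sketched with fewer steps by listing bare names instead of the long format and cutting at the "-" directly, which also avoids the \t escape in sed (a GNU extension, as far as I know). The function name find_duplicate_ids is made up for this example, and it assumes every file follows the <prefix>_<number>-<size>.JPG pattern shown above.

```shell
# Print the picture numbers that exist in more than one size.
# Assumes names like 2010-IMG_4354-640.JPG (hypothetical helper, not part of any tool).
find_duplicate_ids() {
    dir="${1:-.}"
    # Plain ls (no -l) prints one name per line when its output is piped.
    ls "$dir" |
        grep -- '-[0-9]*\.JPG$' |   # keep only names ending in -<size>.JPG
        cut -d "_" -f2 |            # drop the prefix:   4354-640.JPG
        cut -d "-" -f1 |            # drop the size tag: 4354
        sort | uniq -d              # numbers seen more than once
}
```

Calling find_duplicate_ids with no argument checks the current directory. The explicit sort is there only so the result does not depend on the order in which ls returns the names.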
