Monday, June 28, 2010

To find duplicate entries in filenames

I used a script to convert pictures from a party to two different sizes, 640 and 1024. For each picture I chose, I kept the resolution I wanted and deleted the rest. But, as usual, some duplicates remain: pictures that still exist in both sizes.

Running ls -l gives a listing similar to the following.

-rwx------ 1 aaaa aaa  39191 2010-06-28 22:49 2010-IMG_4354-640.JPG
-rwx------ 1 aaaa aaa  41129 2010-06-28 22:49 2010-IMG_4355-640.JPG
-rwx------ 1 aaaa aaa  46668 2010-06-28 22:49 2010-IMG_4356-640.JPG
-rwx------ 1 aaaa aaa 121721 2010-06-28 22:49 2010-IMG_4359-1024.JPG
-rwx------ 1 aaaa aaa 104468 2010-06-28 22:49 2010-IMG_4360-1024.JPG
-rwx------ 1 aaaa aaa 164638 2010-06-28 22:50 2010-IMG_4361-1024.JPG

I will just use this listing to find the duplicates. The naming scheme makes the task much simpler: two copies of the same picture at different resolutions differ only by the 640 or 1024 tag at the end of the name.

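A single command finds our duplicates. It strips each line down to the image number (the part between the "_" and the next "-") and prints every number that appears more than once. Since ls sorts by name, both sizes of a picture end up on adjacent lines, which is what uniq -d needs.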

 ls -l | cut -d "_" -f2 | cut -d "." -f1 | sed 's/-/\t/' | cut -f1 | uniq -d
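
The same idea can be sketched a bit more defensively. This version reads plain ls output instead of ls -l, so an underscore in a user or group name cannot confuse cut, and it sorts explicitly before uniq -d. The function name find_duplicates and the demo file names are made up for illustration, and it still assumes every name looks like 2010-IMG_4354-640.JPG.

```shell
#!/bin/sh
# Sketch: list image numbers that exist in more than one size.
# Assumes names of the form PREFIX_NUMBER-SIZE.JPG, e.g. 2010-IMG_4354-640.JPG.

# find_duplicates is a hypothetical helper; $1 is the directory to scan.
find_duplicates() {
    ls "$1" |
        cut -d "_" -f2 |   # drop everything up to the underscore: 4354-640.JPG
        cut -d "-" -f1 |   # keep only the image number: 4354
        sort |             # uniq only compares adjacent lines
        uniq -d            # print each number that occurs more than once
}

# Demo on throwaway files in a temporary directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/2010-IMG_4354-640.JPG" \
      "$demo_dir/2010-IMG_4354-1024.JPG" \
      "$demo_dir/2010-IMG_4355-640.JPG" \
      "$demo_dir/2010-IMG_4359-1024.JPG"
find_duplicates "$demo_dir"   # prints 4354
rm -r "$demo_dir"
```

To check a real folder, call find_duplicates with its path, for example find_duplicates . for the current directory. Each number printed is a picture that still has both a 640 and a 1024 copy.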

