Creating djvu documents (ebooks) from scans

The best way to make an ebook from a printed book is to scan the pages and convert the scans to djvu. Not pdf, because pdf is very bad at packing images. Djvu, on the other hand, is optimised for scanned documents and can produce a small file without losing quality in the output.

After getting all the scans in place there are a few things to do to get the final djvu document.

  • Rename all files so that they sort in page order, for example
    book-001.jpg book-002.jpg ... book-024.jpg book-025.jpg ... book-165.jpg book-166.jpg
    The prefix can be anything instead of book-
    The crucial thing is that the numbers are zero-padded, so that alphabetical sorting matches numerical sorting. With names like 1.jpg 2.jpg ... 10.jpg 11.jpg ..., 10.jpg would be sorted before 2.jpg.
    We do not want that, hence this naming scheme.

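    If the files are already numbered without padding (1.jpg 2.jpg ... 10.jpg 11.jpg ...), put 1-9 in one folder, 10-99 in another, 100-999 in another and so on. Then go to each folder and bulk rename the files, adding as many zeros as that folder needs. The command below is for the 1-9 folder and assumes the files already carry the book- prefix (book-1.jpg ...); use book-0 in the 10-99 folder, and for bare names like 1.jpg use 's/^/book-00/' with *.jpg instead.

    rename -v 's/book-/book-00/' book-*.jpg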
  • Now collect all the files in one place and convert any that are not already jpg to jpg.
  • Now, run the jpg -> djvu script

    The original script can be found at http://www.howtoforge.com/creating_djvu_documents_on_linux

    I use a modified version suitable only for black and white scans. It is as follows -

    #!/bin/bash
    #
    # any2djvu-bw
    #

    shopt -s extglob

    DEFMASK="*.jpg"
    DPI=300
    # uncomment the following line to compile a bundled DjVu document
    #OUTFILE="#0-bw.djvu"

    # usage is defined before the tool check below, because bash must have
    # seen a function definition before the function can be called
    function usage() {
      echo
      echo "usage:"
      echo
      echo "$0 [\"REGEXP\"]"
      echo "    converts single pages with the default mask $DEFMASK (or REGEXP if provided)"
      echo "    in the current directory to single-page black and white djvu documents"
    # uncomment the following line to compile a bundled DjVu document
    # echo "    and bundles them as a djvu file $OUTFILE"
      echo
    }

    # stop early if any of the required tools is missing
    for tool in anytopnm ppmtopgm pgmtopbm cjb2; do
      if ! command -v "$tool" > /dev/null 2>&1; then
        usage
        echo "Error: anytopnm, ppmtopgm, pgmtopbm and cjb2 are needed"
        echo
        exit 1
      fi
    done

    if [ -n "$1" ]; then
      MASK=$1
    else
      MASK=$DEFMASK
    fi

    # $MASK is left unquoted on purpose so that the shell expands the pattern
    for i in $MASK; do
      if [ ! -e "$i" ]; then
        usage
        echo "Error: current directory must contain files with the mask $MASK"
        echo
        exit 1
      fi
      # skip pages that have already been converted
      if [ ! -e "$i.djvu" ]; then
        echo "$i"
        anytopnm "$i" | ppmtopgm | pgmtopbm -value 0.499 > "$i.pbm"
    # in netpbm >= 10.23 the above line can be replaced with the following:
    #   anytopnm "$i" | ppmtopgm | pamditherbw -value 0.499 > "$i.pbm"
        cjb2 -dpi $DPI "$i.pbm" "$i.djvu"
        rm -f "$i.pbm"
      fi
    done

    # uncomment the following line to compile a bundled DjVu document
    #djvm -c $OUTFILE $MASK.djvu
  • The djvu joiner at the end of the script (the commented-out djvm line) does not work well, so use the method given in http://en.wikisource.org/wiki/Help:DjVu_files#Method_1 instead.

    djvm -c outputfile.djvu book-*.djvu
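
The folder-by-folder renaming from the first step can probably be done in one pass as well. Here is a minimal sketch, assuming the scans have bare numeric names (1.jpg, 2.jpg ... 10.jpg) and that three digits are enough for the page count. The function name pad_names, its arguments and the default prefix book are made up for this illustration and are not part of the script above.

```shell
#!/bin/bash
# pad_names DIR [PREFIX] [WIDTH]
# Hypothetical helper (not from the original script): renames purely numeric
# names such as 1.jpg or 10.jpg in DIR to PREFIX-001.jpg, PREFIX-010.jpg.
# The body runs in a subshell, so its variables do not leak to the caller.
pad_names() (
  dir=$1
  prefix=${2:-book}
  width=${3:-3}
  for f in "$dir"/*.jpg; do
    [ -e "$f" ] || continue            # the pattern matched nothing
    base=$(basename "$f" .jpg)
    case "$base" in
      ''|*[!0-9]*) continue ;;         # skip names that are not plain numbers
    esac
    # strip leading zeros so that printf does not read e.g. 08 as octal
    n=$(printf '%s' "$base" | sed 's/^0*//')
    padded=$(printf "%0${width}d" "${n:-0}")
    # -n refuses to overwrite a file that already has the target name
    mv -n -- "$f" "$dir/$prefix-$padded.jpg"
  done
)

# example call, run from the folder that holds the scans:
# pad_names . book 3
```

Files that already have a non-numeric name (for example book-001.jpg) are left alone, so running it twice should do no harm. Try it on a copy of the scans first.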
Now, we have a properly arranged ebook. Enjoy !!
