This document outlines some ideas for document conversion on Linux and Mac OS X using command-line tools. Distribute documents as plain text in UTF-8 encoding whenever possible. Everyone should embrace the mantra "plain text is beautiful".

Use the file command to obtain basic metadata for most file formats. For image files, make sure ImageMagick is installed, then use the identify command to extract image metadata.

Use the iconv command to convert plain text from one encoding to another. The basic usage is

$ iconv -c -f <from-encoding> -t <to-encoding> input.txt > output.txt

The -c option discards unconvertible characters, and pointy brackets denote required values. For a list of supported encodings run

$ iconv -l

The Poppler library, based on Xpdf, comes with a suite of PDF tools. Use the pdftotext command to extract text from a PDF file, assuming a text layer exists.
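
As a rough illustration of the iconv usage described above, here is a minimal sketch that wraps the conversion in a small shell function. The function name to_utf8, the file names, and the choice of ISO-8859-1 as the source encoding are assumptions made for the example and do not come from the original text. The final file call is guarded because that tool may not be installed everywhere, and its exact output wording varies between systems.

```shell
#!/bin/sh
# Hypothetical helper (not from the original text):
# convert a text file from a given encoding to UTF-8.
# Usage: to_utf8 <from-encoding> <input-file> <output-file>
to_utf8() {
    # -c discards characters that cannot be converted
    iconv -c -f "$1" -t UTF-8 "$2" > "$3"
}

# Demo input: "café" encoded as ISO-8859-1, where é is the single byte 0xE9
printf 'caf\351\n' > latin1.txt

# Convert the demo file to UTF-8
to_utf8 ISO-8859-1 latin1.txt utf8.txt

# Report basic metadata for the result, if the file command is available
command -v file >/dev/null 2>&1 && file utf8.txt
```

Writing to a separate output file, rather than overwriting the input, keeps the original available in case the source encoding was guessed incorrectly.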