1. NAME
pdftosrc - extract source file or stream from PDF file
2. SYNOPSIS
pdftosrc PDF-file [ stream-object-number ]
3. DESCRIPTION
If only PDF-file is given as argument, pdftosrc extracts the embedded source file from the first stream object found with /Type /SourceFile within the PDF-file and writes it to a file with the name /SourceName as defined in that PDF stream object (see the application example below).
If both PDF-file and stream-object-number are given as arguments, and stream-object-number is positive, pdftosrc extracts and uncompresses the PDF stream of the object given by its stream-object-number from the PDF-file and writes it to a file named PDF-file.stream-object-number, with the ending .pdf or .PDF stripped from the original PDF-file name.
A special case concerns XRef object streams, which are part of the PDF standard from PDF-1.5 onward: if stream-object-number equals -1, pdftosrc decompresses the XRef stream from the PDF file and writes it in human-readable PDF cross-reference table format to a file named PDF-file.xref (these XRef streams cannot be extracted just by giving their object number).
In any case, an existing file with the output file name will be overwritten.
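Because an existing output file is overwritten without warning, it can help to know the output name before running the program. The following is a minimal Python sketch of the naming rules as worded above; the function name output_name is hypothetical, and the behavior of a particular pdftosrc build (for example with directory components or other file endings) may differ from this reading of the text.

```python
def output_name(pdf_file: str, stream_object_number: int) -> str:
    """Predict the output file name for the stream and XRef modes.

    Assumption: follows the wording of this man page only. The name used
    in source-file mode comes from /SourceName inside the PDF and cannot
    be predicted here.
    """
    base = pdf_file
    # The description names exactly the endings .pdf and .PDF.
    for ending in (".pdf", ".PDF"):
        if base.endswith(ending):
            base = base[: -len(ending)]
            break
    if stream_object_number == -1:
        # Special case: XRef stream written as a cross-reference table.
        return base + ".xref"
    if stream_object_number > 0:
        return f"{base}.{stream_object_number}"
    raise ValueError("stream-object-number must be positive or -1")
```

A caller could compare the returned name against existing files and move them aside before invoking pdftosrc.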
4. OPTIONS
None.
5. FILES
Just the executable pdftosrc.
6. ENVIRONMENT
None.
7. DIAGNOSTICS
On success the exit code of pdftosrc is 0, otherwise 1. All messages go to stderr.
At program invocation, pdftosrc issues the current version number of the program xpdf, on which pdftosrc is based:
    pdftosrc version 3.01
When pdftosrc has successfully written the output file, one of the following messages is issued:
    Source file extracted to source-file-name
    Stream object extracted to PDF-file.stream-object-number
    Cross-reference table extracted to PDF-file.xref
When the object given by the stream-object-number does not contain a stream, pdftosrc issues the following error message:
    Not a Stream object
When the PDF-file can't be opened, the error message is:
    Error: Couldn't open file 'PDF-file'.
When pdftosrc encounters an invalid PDF file, the error message (several lines) is:
    Error: May not be a PDF file (continuing anyway)
    (more lines)
    Invalid PDF file
pdftosrc also issues further error messages for various kinds of broken PDF files.
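Since every message goes to stderr and the exit code only distinguishes success from failure, a wrapper script may want to tell the outcomes apart by message text. The sketch below is one possible way to do that in Python; the names run_pdftosrc and classify and the returned labels are hypothetical, and the matching relies on the message wording quoted above, which may vary between versions.

```python
import subprocess

# Message prefixes taken from the DIAGNOSTICS text; labels are illustrative.
SUCCESS_PREFIXES = {
    "Source file extracted to ": "source",
    "Stream object extracted to ": "stream",
    "Cross-reference table extracted to ": "xref",
}


def classify(stderr_text: str) -> str:
    """Map pdftosrc's stderr output to a short outcome label."""
    for raw in stderr_text.splitlines():
        line = raw.strip()
        for prefix, kind in SUCCESS_PREFIXES.items():
            if line.startswith(prefix):
                return kind
        if line == "Not a Stream object":
            return "not-a-stream"
        if line.startswith("Error: Couldn't open file"):
            return "cannot-open"
        if line == "Invalid PDF file":
            return "invalid-pdf"
    # Version banner only, or one of the other messages for broken files.
    return "unknown"


def run_pdftosrc(pdf_file, stream_object_number=None):
    """Run pdftosrc (assumed to be on PATH); return (exit code, label)."""
    cmd = ["pdftosrc", pdf_file]
    if stream_object_number is not None:
        cmd.append(str(stream_object_number))
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, classify(proc.stderr)
```

The exit code remains the authoritative success indicator; the label is only a convenience for logging or branching.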
8. NOTES
An embedded source file will be written out unchanged, i.e. it will not be uncompressed in this process.
Only the stream of the object will be written, i.e. not the dictionary of that object.
Knowing which stream-object-number to query requires information about the PDF file that has to be gained elsewhere, e.g. by looking into the PDF file with an editor.
The stream extraction capabilities of pdftosrc (e.g. regarding understood PDF versions and filter types) follow the capabilities of the underlying xpdf program version.
Currently the generation number of the stream object is not supported. The default value 0 (zero) is taken.
The wording stream-object-number has nothing to do with the `object streams' introduced by the Adobe PDF Reference, 5th edition, version 1.6.
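As an alternative to reading the PDF file in an editor, candidate object numbers can be collected with a small script. The following Python sketch is a rough heuristic, not a PDF parser: it scans the raw bytes for "N G obj ... endobj" spans whose dictionary is followed by the stream keyword. The function name stream_object_numbers is hypothetical; binary stream data that happens to contain these keywords can produce wrong results, and objects stored inside compressed object streams are not seen at all.

```python
import re

# "N G obj", then the shortest body up to the next "endobj".
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
# A stream object has its dictionary (ending in ">>") followed by "stream".
STREAM_RE = re.compile(rb">>\s*stream\s")


def stream_object_numbers(pdf_bytes: bytes):
    """Return (object number, generation) pairs that look like stream objects."""
    found = []
    for match in OBJ_RE.finditer(pdf_bytes):
        number = int(match.group(1))
        generation = int(match.group(2))
        if STREAM_RE.search(match.group(3)):
            found.append((number, generation))
    return found


def candidates_from_file(path: str):
    """Read a PDF file and keep only generation-0 candidates."""
    with open(path, "rb") as handle:
        pairs = stream_object_numbers(handle.read())
    # pdftosrc always assumes generation 0, so other generations are skipped.
    return [number for number, generation in pairs if generation == 0]
```

Each number returned by candidates_from_file could then be passed as stream-object-number; a "Not a Stream object" message indicates a false positive of the heuristic.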
9. EXAMPLES
When using pdftex, a source file can be embedded into some PDF-file by using pdftex primitives, as illustrated by the following example:
    \immediate\pdfobj
        stream attr {/Type /SourceFile /SourceName (myfile.zip)}
        file{myfile.zip}
    \pdfcatalog{/SourceObject \the\pdflastobj\space 0 R}
Then this zip file can be extracted from the PDF-file by calling pdftosrc PDF-file.
10. BUGS
Only the first embedded source file found is extracted; any further embedded source files are ignored. Email bug reports to
.
11. SEE ALSO
xpdf(1), pdfimages(1), pdftotext(1), pdftex(1)
12. AUTHORS
pdftosrc was written by Han The Thanh, using xpdf functionality from Derek Noonburg. Man page written by Hartmut Henkel.
13. COPYRIGHT
Copyright (c) 1996-2006 Han The Thanh
This file is part of pdfTeX.
pdfTeX is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
pdfTeX is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with pdfTeX; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA