ChangeLog
2020-04-29 Jay Berkenbilt <[email protected]>
* Bug fix: qpdf --check was writing errors and warnings reported
by checkLinearization to stdout instead of stderr. Fixes #438.
2020-04-09 Jay Berkenbilt <[email protected]>
* 10.0.1: release
2020-04-08 Jay Berkenbilt <[email protected]>
* Bug fix: qpdf 10.0.0 introduced a bug in which
QPDFObjectHandle::getStreamData would return the raw data when
called on an unfilterable stream instead of throwing an exception
like it's supposed to. Fixes #425.
2020-04-07 Jay Berkenbilt <[email protected]>
* Improve pdf-invert-images example to show a pattern of copying
streams into another QPDF object to enable a stream data provider
to access the original stream data.
* Fix error that caused a compilation error with clang. Fixes
#424.
2020-04-06 Jay Berkenbilt <[email protected]>
* 10.0.0: release
* Move random number generation into the crypto providers. The old
os-based secure random number generation with fallback to insecure
random number generation (only if allowed at build time) has moved
into the native crypto provider. If using other providers
(currently gnutls or openssl), random number generation will use
those libraries. The old interfaces for supplying your own random
number generator are still in place. Fixes #418.
* Source-level incompatibility: remove QUtil::srandom. There was
no reason to ever call this, and it didn't do anything unless
insecure random number generation was compiled in, which it is not
by default. If you were calling this, just remove the call because
it wasn't doing anything anyway.
* Add openssl crypto provider, contributed by Dean Scarff. This
provider is implemented using OpenSSL and also works with
BoringSSL.
2020-04-04 Jay Berkenbilt <[email protected]>
* Add a new provideStreamData method for StreamDataProvider that
allows a success code to be returned and that accepts the
suppress_warnings and will_retry parameters. This makes it possible
to have a StreamDataProvider call pipeStreamData and propagate its
results back. This change allows better error handling and
recovery when objects are copied from other files and when
"immediate copy from" is enabled.
* When copying foreign streams, the same type of recovery from
streams with filtering errors is performed as when dealing with
streams in the original input. This could happen, for example, if
you are using the --pages option to take pages from another file
and that file has errors in it.
* Add a new version of QPDFObjectHandle::pipeStreamData whose
return value indicates overall success or failure rather than
whether or not filtering was attempted. It should have always
been this way. This change was done in a backward-compatible
fashion. Previously existing pipeStreamData methods' return values
mean the same as always.
* Add "objectinfo" section to json output. In this release,
information about whether each object is a stream or not is
provided. There's otherwise no way to tell conclusively from the
json output. Over time, other computed information about objects
may be added here.
* Add new option --remove-unreferenced-resources that takes auto,
yes, or no as options. This tells qpdf whether to attempt to
remove unreferenced resources from pages when doing page splitting
operations. Prior to this change, the default was to attempt to
remove unreferenced resources, but this operation was very slow,
especially for large and complex files. The new default is "auto",
which tells qpdf to analyze the file for shared resources. This is
a relatively quick test. If no shared resources are found, then we
don't attempt to remove unreferenced resources, because
unreferenced resources never occur in files without shared
resources. To force qpdf to look for and remove unreferenced
resources, use --remove-unreferenced-resources=yes. The option
--preserve-unreferenced-resources is now a synonym for
--remove-unreferenced-resources=no.
* Use std::atomic for unique ID generation internally within the
library. This eliminates the already extremely low chance of a
collision, improves thread safety, and removes a dependency on a
random number generator. Thanks to Dean Scarff for the
contribution.
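A minimal sketch of the std::atomic approach to unique ID generation
mentioned in this entry. The function name next_unique_id is
hypothetical and is not taken from the qpdf sources; the sketch only
shows why an atomic counter needs neither a lock nor a random number
generator.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not qpdf's actual internal function.
std::uint64_t next_unique_id()
{
    // One process-wide counter. fetch_add is an atomic read-modify-write,
    // so two threads can never receive the same value.
    static std::atomic<std::uint64_t> counter{0};
    return counter.fetch_add(1) + 1;
}

// Usage: consecutive calls from one thread yield consecutive IDs
// (assuming no other thread is calling at the same time).
void demo_unique_ids()
{
    std::uint64_t first = next_unique_id();
    std::uint64_t second = next_unique_id();
    assert(second == first + 1);
}
```

Because the counter only ever increases, IDs cannot collide within a
process, which is the property a random source can only approximate.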
2020-04-03 Jay Berkenbilt <[email protected]>
* Allow qpdf to be built on systems without wchar_t. All "normal"
systems have wchar_t because it is part of the C++ standard, but
there are some stripped down environments that don't have it. See
README.md (search for wchar_t) for instructions and a discussion.
Fixes #406.
* Add two extra optional arguments to
QPDFPageObjectHelper::placeFormXObject to control whether the
placed item is allowed to be shrunk or expanded to fit within or
maximally fill the destination rectangle. Prior to this change,
placeFormXObject might shrink it but would never expand it.
* When calling the C API, accept any non-zero value as TRUE rather
than just 1. This appears to resolve issues on Windows when
calling some versions of the DLL directly from other languages.
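The C API change in this entry can be pictured with the sketch below.
The type and function names are hypothetical and do not come from
qpdf-c.h; the point is only the difference between testing for the
exact value 1 and testing for any non-zero value.

```cpp
#include <cassert>

// Hypothetical stand-in for a C-style boolean parameter.
typedef int DEMO_BOOL;

// Old interpretation: only the exact value 1 counts as true.
bool is_true_strict(DEMO_BOOL value) { return value == 1; }

// New interpretation: any non-zero value counts as true.
bool is_true_lenient(DEMO_BOOL value) { return value != 0; }

void demo_c_bool()
{
    assert(is_true_strict(1) && is_true_lenient(1));
    assert(!is_true_strict(0) && !is_true_lenient(0));
    // A caller that encodes true as -1 is understood only by the
    // lenient check.
    assert(!is_true_strict(-1));
    assert(is_true_lenient(-1));
}
```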
2020-04-02 Jay Berkenbilt <[email protected]>
* Add method QPDFObjectHandle::unsafeShallowCopy for copying only
top-level dictionary keys or array items. See comments in
QPDFObjectHandle.hh for when this should be used.
* Remove Members class indirection for QPDFObjectHandle. Those are
copied and assigned too often, and that change caused a very
substantial performance hit.
2020-03-31 Jay Berkenbilt <[email protected]>
* When detecting unreferenced images during page splitting, if any
XObjects are form XObjects, recursively descend into them and
remove any unreferenced objects from them too. Fixes #373.
* Add QPDFObjectHandle::filterAsContents, which filters a stream's
data as if it were page contents. This can be useful to filter
form XObjects the same way we would filter page contents.
* If QPDF_EXECUTABLE is set, use it as the path to qpdf for
purposes of completion. This variable is only read during the
execution of `qpdf --completion-zsh` and `qpdf
--completion-bash`. It is not used during the actual evaluation of
completions.
2020-02-22 Jay Berkenbilt <[email protected]>
* Update pdf-set-form-values.cc to use and mention
generateAppearance, which hadn't been added when the example was
originally created.
* Detect, warn, and correct the case of /Pages in the document
catalog incorrectly pointing to a page or intermediate node
instead of the root of the pages tree. Fixes #398.
2020-01-26 Jay Berkenbilt <[email protected]>
* 9.1.1: release
* Bug fix: in qdf mode, do not write out any XRef streams that may
have appeared in the original file. These are usually
unreferenced, but with --preserve-unreferenced, they could be
written out, which breaks fix-qdf's assumption that there is at
most one XRef stream and that it appears at the end of the file.
Fixes #386.
* Bug fix: when externalizing inline images, a colorspace value
that was a lookup key in the page's /Resources -> /ColorSpace
dictionary was not properly handled. Fixes #392.
* Add "encrypt" key to the json output. This contains largely the
same information as given by --show-encryption but in a
consistent, parseable format.
* Add options --is-encrypted and --requires-password. These can be
used with files, including encrypted files with unknown passwords,
to determine whether or not a file is encrypted and whether a
password is required to open the file. The --requires-password
option can also be used to determine whether a supplied password
is correct. Information is supplied through exit codes, making
these options particularly useful for shell scripts. Fixes #390.
2020-01-14 Jay Berkenbilt <[email protected]>
* Fix for Windows being unable to acquire crypt context with a new
keyset. Thanks to Cloudmersive for the fix. Fixes #387.
* Rewrite fix-qdf in C++. This means fix-qdf is a proper
executable now, and there is no longer a runtime requirement on
perl.
* Add QUtil::call_main_from_wmain, a helper function that can be
called in the body of wmain to convert UTF-16 arguments to UTF-8
arguments and then call another main function.
2020-01-13 Jay Berkenbilt <[email protected]>
* QUtil::read_lines_from_file: add new versions that use FILE*,
use FILE* instead of std::ifstream internally to support correct
handling of Unicode filenames in Windows, and add the option to
preserve line endings.
2019-11-17 Jay Berkenbilt <[email protected]>
* 9.1.0: release
* This is the first version of qpdf that requires C++11.
2019-11-09 Jay Berkenbilt <[email protected]>
* 9.1.rc1: release
* Improve behavior of wildcard expansion for msvc executable when
run from the Windows cmd.exe shell. Unlike in UNIX environments,
Windows leaves it up to the executable to expand its own
wildcards. Fixes #224.
* Allow :even or :odd to be appended to numeric ranges for
--pages, --rotate, and other options that take page ranges.
* When reading /P from the encryption dictionary, use static_cast
instead of QIntC to convert the value to a signed integer. The
value of /P is a bit field, and PDF files have been found in the
wild where /P is represented as an unsigned integer even though
the spec states that it is a signed 32-bit value. By using
static_cast, we allow qpdf to compensate for writers that
incorrectly represent the correct bit field as an unsigned value.
Fixes #382.
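The /P item in this entry contrasts a range-checked conversion with a
bit-preserving one. The sketch below illustrates the difference with
hypothetical helper names. It is not qpdf's code: the entry describes a
static_cast, while the sketch uses memcpy for the final step so that
the result is well-defined in every C++ standard.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <stdexcept>

// Range-checked conversion: rejects anything outside the signed
// 32-bit range, so an unsigned representation of /P is an error.
std::int32_t p_checked(long long raw)
{
    if (raw < INT32_MIN || raw > INT32_MAX) {
        throw std::range_error("/P does not fit in a signed 32-bit integer");
    }
    return static_cast<std::int32_t>(raw);
}

// Bit-preserving conversion: keep the low 32 bits and reinterpret
// them as a signed value.
std::int32_t p_reinterpreted(long long raw)
{
    std::uint32_t bits = static_cast<std::uint32_t>(raw); // modular, well-defined
    std::int32_t result;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}

void demo_p_conversion()
{
    // 4294967292 is the unsigned spelling of the bit field whose
    // signed value is -4.
    assert(p_reinterpreted(-4) == -4);
    assert(p_reinterpreted(4294967292LL) == -4);
    bool threw = false;
    try {
        p_checked(4294967292LL);
    } catch (std::range_error const&) {
        threw = true;
    }
    assert(threw);
}
```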
2019-11-05 Jay Berkenbilt <[email protected]>
* Add support for pluggable crypto providers, enabling multiple
implementations of the cryptographic functions needed by qpdf.
This feature was added by request of Red Hat, which recognized the
use of qpdf's native crypto implementations as a potential
security liability, preferring instead to get all crypto
functionality from a third-party library that receives a lot of
scrutiny. However, it was also important to me to not impose any
unnecessary third-party dependencies on my users or packagers,
some of whom build qpdf for lots of environments, some of which
may not easily support gnutls. Starting in qpdf 9.1.0, it is
possible to build qpdf with both the native and gnutls crypto
providers or with either in isolation. In support of this feature,
new classes QPDFCryptoProvider and QPDFCryptoImpl have been added
to the public interface. See QPDFCryptoImpl.hh for details about
adding your own crypto provider and QPDFCryptoProvider.hh for
details about choosing which one is used. Note that selection of
crypto providers is invisible to anyone who doesn't explicitly
care. Neither end users nor developers have to be concerned about
it.
* The environment variable QPDF_CRYPTO_PROVIDER can be used to
override qpdf's default choice of crypto provider. The
--show-crypto flag to the qpdf CLI can be used to present a list
of supported crypto providers with the default provider always
listed first.
* Add gnutls crypto provider. Thanks to Zdenek Dohnal for
contributing the code that I ultimately used in the gnutls crypto
provider and for engaging in an extended discussion about this
feature. Fixes #218.
2019-10-22 Jay Berkenbilt <[email protected]>
* Incorporate changes from Masamichi Hosoda <[email protected]>
to properly handle signatures in the following ways:
- Always represent /Contents in a signature dictionary as a hex
string
- Do not compress signature dictionaries when generating object
streams
- Do not encrypt/decrypt the /Contents field of the signature
dictionary when creating or reading encrypted files
* Incorporate changes from Masamichi Hosoda <[email protected]>
to add additional methods for making it possible to gain deeper
insight into cross reference tables and object renumbering. These
new API calls make it possible for applications to go into PDF
files created by qpdf and make changes to them that go beyond
working with the PDF at the object level. The specific use case
for these changes was to write an external tool to perform digital
signature, but there could be other uses as well. New methods
include the following, all of which are described in their
respective headers:
- QPDF::getXRefTable()
- QPDFObjectHandle::getParsedOffset()
- QPDFWriter::getRenumberedObjGen(QPDFObjGen)
- QPDFWriter::getWrittenXRefTable()
2019-10-12 Jay Berkenbilt <[email protected]>
* 9.0.2: release
* Change the name of the temporary file used by --replace-input to
work with arbitrary absolute or relative paths without requiring
path splitting logic. Fixes #365.
2019-09-20 Jay Berkenbilt <[email protected]>
* 9.0.1: release
2019-09-19 Jay Berkenbilt <[email protected]>
* When converting an array to a Rectangle, ensure that llx <= urx
and lly <= ury. This prevents flatten-annotations from flipping
fields whose coordinates are messed up in the input. Fixes #363.
* Warn when duplicated dictionary keys are found during parsing.
The behavior remains as before: later keys override earlier ones.
However, this generates a warning now rather than being silently
ignored. Fixes #345.
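The rectangle normalization described in this entry amounts to sorting
each pair of coordinates. A minimal sketch follows, assuming a
four-number array in the order x1 y1 x2 y2; the Rect struct and
normalize_rect function are hypothetical names, not qpdf's
QPDFObjectHandle::Rectangle.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical rectangle type for illustration.
struct Rect
{
    double llx;
    double lly;
    double urx;
    double ury;
};

// Build a rectangle whose lower-left corner is never above or to the
// right of its upper-right corner, whatever order the input used.
Rect normalize_rect(double x1, double y1, double x2, double y2)
{
    return Rect{
        std::min(x1, x2), std::min(y1, y2),
        std::max(x1, x2), std::max(y1, y2)};
}

void demo_normalize_rect()
{
    // The x coordinates are swapped in the input; y is already ordered.
    Rect r = normalize_rect(100.0, 50.0, 10.0, 200.0);
    assert(r.llx == 10.0 && r.urx == 100.0);
    assert(r.lly == 50.0 && r.ury == 200.0);
}
```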
2019-09-17 Jay Berkenbilt <[email protected]>
* Fix a few integer warnings for big-endian systems.
* QIntC tests: don't assume char is signed. Fixes #361.
2019-08-31 Jay Berkenbilt <[email protected]>
* 9.0.0: release
* Add QPDF::anyWarnings() method to find out whether there have
been any warnings without resetting the list.
* Add QPDF::closeInputSource() method to release the input source
so the input file can be deleted or renamed.
* Add methods rename_file and remove_file to QUtil.
2019-08-24 Jay Berkenbilt <[email protected]>
* Add QPDF::userPasswordMatched() and QPDF::ownerPasswordMatched()
methods so it can be determined separately whether the supplied
password matched the user password, the owner password, or both.
Fixes #159.
2019-08-23 Jay Berkenbilt <[email protected]>
* Add --recompress-flate option to qpdf and
QPDFWriter::setRecompressFlate to cause QPDFWriter to recompress
streams that are already compressed with /FlateDecode.
* Add option Pl_Flate::setCompressionLevel to globally set the
zlib compression level used by all Pl_Flate pipelines.
* Add --compression-level flag to qpdf to set the zlib compression
level. When combined with --recompress-flate, this will cause most
of qpdf's streams to use the maximum compression level. This
results in only a very small amount of savings in size that comes
at a fairly significant performance cost, but it could be useful
for archival files or other cases where every byte counts and
creation time doesn't matter so much. Note that using
--object-streams=generate in combination with these options gives
you the biggest advantage. Fixes #113.
2019-08-22 Jay Berkenbilt <[email protected]>
* In QPDFObjectHandle::ParserCallbacks, in addition to
handleObject(QPDFObjectHandle), allow developers to override
handleObject(QPDFObjectHandle, size_t offset, size_t length). If
this method appears instead, it is called with the offset of the
object in the content stream (which may be concatenated from an
array of streams) and the length of the object. Intervening
whitespace and comments are not included in offset and length.
* Add method
QPDFObjectHandle::ParserCallbacks::contentSize(size_t). If
defined, it is called by the content stream parser before the
first call to handleObject, and the argument is the total size in
bytes of the content streams.
* Add QPDFObjectHandle::isDirectNull() -- a const method that
allows determining whether an object is a literal null without
attempting to resolve it.
* Stop replacing indirect references to null with literal null in
arrays when writing output with QPDFWriter.
2019-08-19 Jay Berkenbilt <[email protected]>
* Accept (and warn for) extraneous whitespace preceding the xref
table. Fixes #341.
* Accept (and warn for) extraneous whitespace between the stream
keyword and newline. Fixes #329.
* Properly handle name tokens containing # not preceding two
hexadecimal digits. Such names are invalid in PDF >= 1.2 but valid
in PDF 1.0 and 1.1. Prior to this fix, qpdf's behavior was to
treat such tokens as an error for PDF >= 1.2, but for older PDF
versions, the name was silently accepted, and when the name token
was written out, the # was changed to #23, which is the correct
way to represent a # character. This behavior was problematic for
several reasons: one is that, ordinarily, content streams are not
parsed, so this would cause things like image references whose
names contained # to break. Also, even if the input file was 1.0
or 1.1, there's no guarantee that the output file wouldn't be
written at a new version, resulting in invalid name tokens. The
new behavior is to issue a warning upon encountering such a token
but to accept it, regardless of the PDF version. Such tokens are
written out properly as well. Additionally, the warning message
indicates that the tokens are invalid for PDF >= 1.2. Fixes #332.
* Non-compatible API change: remove
QPDFTokenizer::allowPoundAnywhereInName(). There were a lot of
problems with this. When it was used, any name tokens read would
always be modified on output, which is never the correct behavior.
This method used to signal QPDFTokenizer to not treat # specially
in name tokens, which resulted in the incorrect behavior whose fix
is described in the preceding item.
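The name token behavior described in this entry can be sketched as
follows: a # followed by two hexadecimal digits is decoded, and any
other # is kept literally while a warning is recorded. The names
DecodedName and decode_name are hypothetical, and the sketch ignores
everything else a real tokenizer has to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type: the decoded name plus a flag saying
// whether an invalid # was seen.
struct DecodedName
{
    std::string value;
    bool warned;
};

// Return 0..15 for a hexadecimal digit, -1 for anything else.
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') { return c - '0'; }
    if (c >= 'a' && c <= 'f') { return c - 'a' + 10; }
    if (c >= 'A' && c <= 'F') { return c - 'A' + 10; }
    return -1;
}

DecodedName decode_name(std::string const& token)
{
    DecodedName out{"", false};
    for (std::size_t i = 0; i < token.size(); ++i) {
        char c = token[i];
        if (c == '#') {
            // Decode only when two hex digits follow.
            if (i + 2 < token.size()) {
                int hi = hex_value(token[i + 1]);
                int lo = hex_value(token[i + 2]);
                if (hi >= 0 && lo >= 0) {
                    out.value += static_cast<char>(hi * 16 + lo);
                    i += 2;
                    continue;
                }
            }
            // Invalid escape: accept the # as-is and remember to warn.
            out.warned = true;
        }
        out.value += c;
    }
    return out;
}

void demo_decode_name()
{
    DecodedName ok = decode_name("/A#20B");
    assert(ok.value == "/A B" && !ok.warned);
    DecodedName bad = decode_name("/A#B");
    assert(bad.value == "/A#B" && bad.warned);
}
```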
2019-08-18 Jay Berkenbilt <[email protected]>
* When traversing the pages tree, if an invalid /Type key is
encountered, fix it. This is not done for all operations, but it
will be done for any case in which getAllPages is called. This
includes all page-based CLI operations. (Hopefully) Fixes #349.
2019-08-17 Jay Berkenbilt <[email protected]>
* Change internal implementation of QPDF arrays to use sparse
arrays, which results in using much less memory for arrays with
large numbers of nulls. Various files have been encountered in the
wild that contain thousands of arrays with millions of nulls.
Fixes #305, #311.
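A rough sketch of the sparse array idea in this entry: store only the
non-null elements, keyed by index, and remember the logical size
separately. The SparseArray class below is hypothetical and uses
strings as stand-ins for objects, with the string "null" standing in
for a null object; qpdf's internal implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

class SparseArray
{
  public:
    // Append an element; nulls only increase the logical size.
    void append(std::string const& value)
    {
        if (value != "null") {
            elements_[size_] = value;
        }
        ++size_;
    }

    // Return the element at an index, synthesizing nulls on demand.
    std::string at(std::size_t index) const
    {
        if (index >= size_) {
            throw std::out_of_range("index past end of array");
        }
        auto it = elements_.find(index);
        return it == elements_.end() ? std::string("null") : it->second;
    }

    std::size_t size() const { return size_; }            // logical length
    std::size_t stored() const { return elements_.size(); } // entries in memory

  private:
    std::map<std::size_t, std::string> elements_;
    std::size_t size_ = 0;
};

void demo_sparse_array()
{
    SparseArray a;
    for (int i = 0; i < 1000; ++i) {
        a.append("null");
    }
    a.append("42");
    assert(a.size() == 1001);
    assert(a.stored() == 1); // only the non-null element is held
    assert(a.at(5) == "null");
    assert(a.at(1000) == "42");
}
```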
2019-07-03 Jay Berkenbilt <[email protected]>
* Non-compatible API change: change
QPDFOutlineDocumentHelper::getTopLevelOutlines and
QPDFOutlineObjectHelper::getKids to return a std::vector instead
of a std::list of QPDFOutlineObjectHelper objects. This is to work
around bugs with some compilers' STL implementations that are
choking with list here. There's no deep reason for these to be
lists instead of vectors. Fixes #297.
2019-06-22 Jay Berkenbilt <[email protected]>
* Handle encrypted files with missing or invalid /Length entries
in the encryption dictionary.
* QPDFWriter: allow calling set*EncryptionParameters before
calling setFilename. Fixes #336.
* It now works to run --completion-bash and --completion-zsh when
qpdf is started from an AppImage.
* Provided a more useful error message when Windows can't get
security context. Thanks to user zdenop for supplying some code.
Fixes #286.
* Favor PointerHolder over manual memory allocation in shippable
code where possible. Fixes #235.
* If pkg-config is available, use it to locate libjpeg and zlib. If
not, fall back to old behavior. Fixes #324.
* The "make install" target explicitly sets a mode rather than
relying on the user's umask. Fixes #326.
* When a file has linearization warnings but no errors, qpdf
--check and --check-linearization now exit with code 3 instead
of 2. Fixes #50.
* Add new function QUtil::read_file_into_memory.
2019-06-21 Jay Berkenbilt <[email protected]>
* When supported, qpdf builds with -fvisibility=hidden, which
removes non-exported symbols from the shared library in a manner
similar to how Windows DLLs work. This is better for performance
and also better for safety and protection of private interfaces.
See https://gcc.gnu.org/wiki/Visibility. *NOTE*: If you are
getting linker errors trying to catch exceptions or derive things
from a base class in the qpdf library, it's possible that a
QPDF_DLL_CLASS declaration is missing somewhere. Please report
this as a bug at https://github.com/qpdf/qpdf/issues.
* Source-level incompatibility: remove the version
QPDF::copyForeignObject with an unused boolean parameter. If you
were, for some reason, calling this, just take the parameter away.
* Source-level incompatibility: remove the version
QPDFTokenizer::expectInlineImage with no arguments. It didn't
produce correct inline images. This is a very low-level routine.
There is little reason to call it outside of qpdf's lexical
engine.
* Source-level incompatibility: rename QUtil::strcasecmp to
QUtil::str_compare_nocase. This is a non-compatible change, but
QUtil::strcasecmp is hardly the most important part of qpdf's API.
The reason for this change is that strcasecmp is a macro on some
systems, and that was causing problems when QUtil.hh was included
in certain circumstances. Fixes #242.
2019-06-20 Jay Berkenbilt <[email protected]>
* Enable compilation with additional warnings for integer
conversion and sign (-Wsign-conversion, -Wconversion for gcc and
similar; -W3 for msvc) if supported. These warnings are on by
default and can be turned off by passing --disable-int-warnings.
* Fix all integer sign and conversion warnings. This makes all
integer type conversions that have potential data loss explicit
with calls that do range checks and raise an exception.
* Change out_bufsize argument to Pl_Flate's constructor from int to
unsigned int for compatibility with underlying zlib
implementation.
* Change QPDFObjectHandle::pipeStreamData's encode_flags argument
from unsigned long to int since int is the underlying type of the
enumerated type values that are passed to it. This change should
be invisible to virtually all code unless you are compiling with
strict warning flags and explicitly casting to unsigned long.
* Add methods to QPDFObjectHandle to return the value of Integer
objects as int and unsigned int with range checking and fallback
behavior to avoid silent underflow/overflow conditions.
* Add functions to QUtil to convert unsigned integers to strings,
avoiding implicit conversion between unsigned and signed integer
types.
* Add QIntC.hh, containing integer type converters that do range
checking.
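The range-checked integer conversions mentioned in this entry can be
illustrated with a small template. The name checked_int_cast is
hypothetical and this is not the interface of QIntC.hh; it is only one
way to make a lossy conversion raise an exception instead of silently
changing the value.

```cpp
#include <cassert>
#include <stdexcept>
#include <type_traits>

// Convert between integer types, throwing if the value would change.
template <typename To, typename From>
To checked_int_cast(From value)
{
    static_assert(
        std::is_integral<To>::value && std::is_integral<From>::value,
        "checked_int_cast works on integer types only");
    To result = static_cast<To>(value);
    // The value must survive a round trip...
    bool round_trips = (static_cast<From>(result) == value);
    // ...and must not flip between negative and non-negative.
    bool same_sign = ((value < From(0)) == (result < To(0)));
    if (!round_trips || !same_sign) {
        throw std::range_error("integer conversion out of range");
    }
    return result;
}

void demo_checked_int_cast()
{
    assert(checked_int_cast<unsigned char>(42) == 42);
    bool threw = false;
    try {
        checked_int_cast<unsigned int>(-1); // negative to unsigned
    } catch (std::range_error const&) {
        threw = true;
    }
    assert(threw);
}
```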
2019-06-18 Jay Berkenbilt <[email protected]>
* Remove previously submitted qpdf_read_memory_fuzzer as it is a
small subset of qpdf_fuzzer.
2019-06-15 Jay Berkenbilt <[email protected]>
* Update CI (Azure Pipelines) to run tests with some sanitizers.
* Do "ideal integration" with oss-fuzz. This includes adding a
better fuzzer with a seed corpus and adding automated tests of the
fuzzer with the test data.
* When parsing files, while reading an object, if there are too
many consecutive errors without enough intervening successes, give
up on the specific object. This reduces cases in which very badly
damaged files send qpdf into a tailspin reading one character at
a time and reporting warnings.
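The give-up rule for badly damaged objects described in this entry
could be tracked with a pair of counters, as in the sketch below. The
ErrorBudget class and both thresholds are hypothetical; the entry does
not state the limits qpdf actually uses.

```cpp
#include <cassert>

// Track consecutive errors, forgiving them only after enough
// successes in a row.
class ErrorBudget
{
  public:
    ErrorBudget(int max_bad, int good_to_reset) :
        max_bad_(max_bad),
        good_to_reset_(good_to_reset)
    {
    }

    // Record a successful read; a long enough streak clears the errors.
    void record_success()
    {
        if (++good_ >= good_to_reset_) {
            bad_ = 0;
            good_ = 0;
        }
    }

    // Record an error; returns true when the caller should give up.
    bool record_error()
    {
        good_ = 0;
        return ++bad_ > max_bad_;
    }

  private:
    int max_bad_;
    int good_to_reset_;
    int bad_ = 0;
    int good_ = 0;
};

void demo_error_budget()
{
    ErrorBudget budget(3, 2);
    assert(!budget.record_error());
    assert(!budget.record_error());
    assert(!budget.record_error());
    budget.record_success();       // one success is not enough to reset
    assert(budget.record_error()); // fourth error: give up
}
```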
2019-06-13 Jay Berkenbilt <[email protected]>
* Perform initial integration of Google's oss-fuzz project by
copying the fuzzer that someone from Google had already written
into the qpdf repository and adding build support. This shift in
control is in preparation for an ideal integration with oss-fuzz.
2019-06-09 Jay Berkenbilt <[email protected]>
* When /DecodeParms is an empty list, ignore it on read and delete
it on write. Fixes #331.
2019-05-18 Jay Berkenbilt <[email protected]>
* 8.4.2: release
2019-05-16 Jay Berkenbilt <[email protected]>
* Fix memory error in Windows-only code from typo. Fixes #330.
2019-04-27 Jay Berkenbilt <[email protected]>
* 8.4.1: release
2019-04-20 Jay Berkenbilt <[email protected]>
* When qpdf --version is run, it will detect if the qpdf CLI was
built with a different version of qpdf than the library. This
usually indicates that multiple versions of qpdf are installed and
that the library path is not set up properly. This situation
sometimes causes confusing behavior for users who are not actually
running the version of qpdf they think they are running.
* Add parameter --remove-page-labels to remove page labels from
output. In qpdf 8.3.0, the behavior changed so that page labels
were preserved when merging and splitting files. Some users were
relying on the fact that if you ran qpdf --empty --pages ... all
page labels were dropped. This option makes it possible to get
that behavior if it is explicitly desired. Fixes #317.
* Add parameter --keep-files-open-threshold to override the
maximum number of files that qpdf will allow to be kept open at
once. Fixes #288.
* Handle Unicode characters in filenames properly on Windows. The
changes to support Unicode on the CLI in Windows broke Unicode
filenames on that platform. Fixes #298.
* Slightly tighten logic that determines whether an object is a
page. The previous logic was sometimes failing to preserve
annotations because they were passing the overly loose test for
whether something was a page. This fix has a slight risk of
causing some extraneous objects to be copied during page splitting
and merging for erroneous PDF files whose page objects contain
invalid types or are missing the /Type key entirely, both of which
would be invalid according to the PDF specification.
* Revert change that included preservation of outlines (bookmarks)
in --split-pages. The way it was implemented caused a very
significant performance penalty when splitting pages with
outlines. We need a better solution that only copies the relevant
items, not the whole tree.
2019-03-11 Jay Berkenbilt <[email protected]>
* JSON serialization: add missing leading 0 to decimal values
between -1 and 1. Fixes #308.
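JSON does not allow a number to begin with a bare decimal point, which
is what this fix addresses. A minimal sketch of the correction follows,
assuming the input is already a decimal string; add_leading_zero is a
hypothetical name, not a qpdf function.

```cpp
#include <cassert>
#include <string>

// Insert the 0 that JSON requires before a leading decimal point.
std::string add_leading_zero(std::string const& num)
{
    if (!num.empty() && num[0] == '.') {
        return "0" + num;
    }
    if (num.size() >= 2 && num[0] == '-' && num[1] == '.') {
        return "-0" + num.substr(1);
    }
    return num;
}

void demo_add_leading_zero()
{
    assert(add_leading_zero(".5") == "0.5");
    assert(add_leading_zero("-.25") == "-0.25");
    assert(add_leading_zero("1.5") == "1.5"); // already valid, unchanged
}
```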
2019-02-01 Jay Berkenbilt <[email protected]>
* 8.4.0: release
2019-01-31 Jay Berkenbilt <[email protected]>
* Bug fix: do better pre-checks on images before optimizing;
refuse to optimize images that can't be converted to JPEG because
of colorspace or depth.
* Add new options --externalize-inline-images, which converts
inline images larger than a specified size to regular images, and
--ii-min-bytes, which tweaks that size.
* When optimizing images, inline images are now included in the
optimization, first being converted to regular images. Use
--keep-inline-images to exclude them from optimization. Fixes #278.
* Add method QPDFPageObjectHelper::externalizeInlineImages, which
converts inline images whose size is at least a specified amount
to regular images.
* Remove traces of acroread, which hasn't been available in Linux
for a long time.
2019-01-30 Jay Berkenbilt <[email protected]>
* Do not include space after ID operator in inline image data. The
token now correctly contains the image data, the EI operator,
and the delimiter that precedes the EI operator.
* Improve locating of an inline image's EI operator to correctly
handle the case of EI appearing inside the image data.
* Very low-level QPDFTokenizer API now includes an
expectInlineImage method that takes an input stream, enabling it
to locate an inline image's EI operator better. When this method
is called, the inline image token returned will not contain the EI
operator and will contain correct image data. This is called
automatically everywhere within the qpdf library. Most user code
will never have to use the low-level tokenizer API. If you use
Pl_QPDFTokenizer, this will be done automatically for you. If you
use the low-level API and call expectInlineImage, you should call
the new version.
2019-01-29 Jay Berkenbilt <[email protected]>
* Bug fix: when returning an inline image token, the tokenizer no
longer includes the delimiter that follows EI. The
QPDFObjectHandle created from the token was correct.
* Handle files with direct page objects, which is not allowed by
the PDF spec but has been seen in the wild. Fixes #164.
2019-01-28 Jay Berkenbilt <[email protected]>
* Bug fix: when using --stream-data=compress, object streams and
xref streams were not compressed. They were compressed if no
--stream-data option was specified. Fixes #271.
* When linearizing or getting the list of all pages in a file,
replace duplicated page objects with a shallow copy of the page
object. Linearization and all page manipulation APIs require page
objects to be unique. Pages that were originally duplicated will
still share contents and any other indirect resources. Fixes #268.
2019-01-26 Jay Berkenbilt <[email protected]>
* Add --overlay and --underlay options. Fixes #207.
* Create examples/pdf-overlay-page.cc to demonstrate use of
page/form XObject interaction
* Add new methods QPDFPageObjectHelper::getFormXObjectForPage,
which creates a form XObject equivalent to a page, and
QPDFObjectHandle::placeFormXObject, which generates content stream
code to place a form XObject on a page.
2019-01-25 Jay Berkenbilt <[email protected]>
* Add new method QPDFObjectHandle::getUniqueResourceName() to
return an unused key available to be used in a resource
dictionary.
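The idea of picking an unused resource key can be sketched outside of qpdf as follows. This is a stand-alone illustration under stated assumptions, not the library's code: unused_key_sketch, the std::set of existing keys, and the prefix-plus-counter naming scheme are all hypothetical stand-ins.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical stand-in: used_keys plays the role of the keys already
// present in a resource dictionary. Candidates are built as prefix +
// counter, and the counter is advanced past every candidate tried so the
// caller can continue from where the search stopped.
std::string unused_key_sketch(
    std::set<std::string> const& used_keys,
    std::string const& prefix,
    int& next_suffix)
{
    assert(next_suffix >= 0);
    for (;;) {
        std::ostringstream candidate;
        candidate << prefix << next_suffix;
        ++next_suffix;
        if (used_keys.count(candidate.str()) == 0) {
            return candidate.str();
        }
    }
}
```

Passing the counter by reference means repeated calls do not rescan suffixes that were already rejected or handed out.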
* Add new method QPDFPageObjectHelper::getAttribute() that
properly handles inherited attributes and allows for creation of a
copy of shared attributes. This is very useful if you are getting
an attribute of a page dictionary with the intent to modify it
privately for that page.
* Fix QPDFPageObjectHelper::getPageImages (and the legacy
QPDFObjectHandle::getPageImages()) to properly handle images in
inherited resources dictionaries.
2019-01-20 Jay Berkenbilt <[email protected]>
* Tweak the content code generated for variable text fields to
better handle font sizes and multi-line text.
* When generating appearance streams for variable text
annotations, properly handle the cases of there being no
appearance dictionary, no appearance stream, or an appearance
stream with no BMC..EMC marker.
* When flattening annotations, remove annotations from the file
that don't have appearance streams. These were previously being
preserved, but since they are invisible, there is no reason to
preserve them when flattening annotations.
2019-01-19 Jay Berkenbilt <[email protected]>
* NOTE: qpdf CLI: some non-compatible changes were made to how
qpdf interprets password arguments that contain Unicode characters
that fall outside of ASCII. On Windows, the non-compatibility was
unavoidable, as explained in the release notes. On all platforms,
it is possible to get the old behavior if desired, though the old
behavior would almost always result in files that other
applications were unable to open. As it stands, qpdf should now be
able to open files encrypted with a wide range of passwords
that some other viewers might not handle, though even now, qpdf's
Unicode password handling is not 100% complete.
* Add --password-mode option, which allows fine-grained control of
how password arguments are treated. This is discussed fully in the
manual. Fixes #215.
* Add option --suppress-password-recovery to disable the behavior
of searching for a correct password by re-encoding the provided
password. This option can be useful if you want to ensure you know
exactly what password is being used.
2019-01-17 Jay Berkenbilt <[email protected]>
* When attempting to open an encrypted file with a password, if
the password doesn't work, try alternative passwords created by
re-interpreting the supplied password with different string
encodings. This makes qpdf able to recover passwords with
non-ASCII characters when either the decryption or encryption
operation was performed with an incorrectly encoded password.
* Fix data loss bug: qpdf was discarding referenced resources in
the case in which a page's resource dictionary contained an
indirect reference for either /Font or /XObject that contained
fonts or XObjects not referenced on all pages that shared the
resource. This was a "typo" in the code. The comment explained the
correct behavior, and the code was clearly intended to handle this
issue, but the implementation had an error in it. This is fixed by
a single-line change, which can be found in git commit
4bc434000c42a7191e705c8a38216ca6743ad9ff. That commit can be used
as a patch that applies cleanly against qpdf 8.1.0 and forward.
The bug was introduced in version 8.1.0. For the record, this is
the first bug in qpdf's history that could result in silent loss
of data when processing a correct input file. Fixes #276.
2019-01-15 Jay Berkenbilt <[email protected]>
* Add QUtil::possible_repaired_encodings which, given a string,
generates other strings that represent re-interpretation of the
bytes in a different coding system. This is used to help recover
passwords if the password string was improperly encoded on a
different system due to user error or a software bug.
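A rough picture of what "re-interpretation of the bytes in a different coding system" means is sketched below. It is a stand-alone illustration, not the QUtil implementation: all function names are hypothetical, and only ISO-8859-1 is considered as the alternative coding system, which is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: treat every byte as an ISO-8859-1 code point and
// encode it as UTF-8.
std::string latin1_bytes_to_utf8(std::string const& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned char ch = static_cast<unsigned char>(in[i]);
        if (ch < 0x80) {
            out += static_cast<char>(ch);
        } else {
            out += static_cast<char>(0xC0 | (ch >> 6));
            out += static_cast<char>(0x80 | (ch & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: decode UTF-8 into single bytes. Returns false
// unless every character is in the range U+0000..U+00FF.
bool utf8_to_latin1_bytes(std::string const& in, std::string& out)
{
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char ch = static_cast<unsigned char>(in[i]);
        if (ch < 0x80) {
            out += static_cast<char>(ch);
            i += 1;
        } else if ((ch == 0xC2 || ch == 0xC3) && i + 1 < in.size() &&
                   (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80) {
            unsigned char next = static_cast<unsigned char>(in[i + 1]);
            out += static_cast<char>(((ch & 0x03) << 6) | (next & 0x3F));
            i += 2;
        } else {
            return false;
        }
    }
    return true;
}

// Collect the re-interpretations that differ from the supplied string.
std::vector<std::string> reinterpretations_sketch(std::string const& supplied)
{
    std::vector<std::string> result;
    std::string as_utf8 = latin1_bytes_to_utf8(supplied);
    if (as_utf8 != supplied) {
        result.push_back(as_utf8);
    }
    std::string as_latin1;
    if (utf8_to_latin1_bytes(supplied, as_latin1) && as_latin1 != supplied) {
        result.push_back(as_latin1);
    }
    return result;
}

// Self-check: "caf\xC3\xA9" is "cafe" with an acute accent, in UTF-8.
void example_reinterpretations()
{
    std::vector<std::string> r = reinterpretations_sketch("caf\xC3\xA9");
    assert(r.size() == 2);
    assert(r[0] == "caf\xC3\x83\xC2\xA9"); // bytes re-encoded as if Latin-1
    assert(r[1] == "caf\xE9");             // UTF-8 decoded to one byte
}
```

Each candidate would then be tried as a password in turn. A pure ASCII string yields no candidates, since every interpretation gives the same bytes.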
2019-01-14 Jay Berkenbilt <[email protected]>
* Add new CLI flags to 128-bit and 256-bit encryption: --assemble,
--annotate, --form, and --modify-other to control encryption
permissions with more granularity than was allowed with the
--modify flag. Fixes #214.
* Add new versions of
QPDFWriter::setR{3,4,5,6}EncryptionParameters that allow
individual setting of the various permission bits. The old
interfaces are retained for backward compatibility. In the "C"
API, add qpdf_set_r{3,4,5,6}_encryption_parameters2. The new
interfaces use separate booleans for various permissions instead
of the qpdf_r3_modify_e enumerated type, which set permission bits
in predefined groups.
* Add versions of utf8 to single-byte character transcoders that
return a success code.
2019-01-13 Jay Berkenbilt <[email protected]>
* Add several more string transcoding and analysis methods to
QUtil for bidirectional conversion between PDF Doc, Win Ansi, Mac
Roman, UTF-8, and UTF-16 along with detection of valid UTF-8 and
UTF-16.
2019-01-12 Jay Berkenbilt <[email protected]>
* In the --pages option, allow the same page to be specified more
than once. You can now do "--pages A.pdf 1,1 --" or
"--pages A.pdf 1 A.pdf 1" instead of having to use two different
paths to specify A.pdf. Fixes #272.
* Add QPDFPageObjectHelper::shallowCopyPage(). This method creates
a new page object that is a "shallow copy" of the given page as
described in the comments in QPDFPageObjectHelper. The resulting
object has not been added anywhere but is ready to be passed to
QPDFPageDocumentHelper::addPage of its own QPDF or another QPDF
object.
* Add QPDF::getUniqueId() method to return an identifier that is
intended to be unique within the scope of all QPDF objects created
by the calling application in a single run.
* In --pages, allow "." as a replacement for the current input
file, making it possible to say "qpdf A.pdf --pages . 1-3 --"
instead of having to repeat the input filename.
2019-01-10 Jay Berkenbilt <[email protected]>
* Add new configure option --enable-avoid-windows-handle, which
causes the symbol AVOID_WINDOWS_HANDLE to be defined. If set, we
avoid using Windows I/O HANDLE, which is disallowed in some
versions of the Windows SDK, such as for Windows phones.
QUtil::same_file will always return false in this case. Only
applies to Windows builds.
* Add new method QPDF::setImmediateCopyFrom. When called on a
source QPDF object, streams can be copied FROM that object to
other ones without having to keep the source QPDF or its input
source around. The cost is copying the streams into RAM. See
comments in QPDF.hh for setImmediateCopyFrom for a detailed
explanation.
2019-01-07 Jay Berkenbilt <[email protected]>
* 8.3.0: release
* Add sample completion files in completions. These can be used by
packagers to install on the system wherever bash and zsh keep
their vendor-supplied completions.
* Add configure flag --enable-check-autofiles, which is on by
default. Packagers whose packaging systems automatically refresh
autoconf or libtool files should pass --disable-check-autofiles to
./configure to suppress warnings about automatically generated
files being outdated.
2019-01-06 Jay Berkenbilt <[email protected]>
* Remove the restriction in most cases that the source QPDF used
in a copyForeignObject call has to stick around until the
destination QPDF is written. The exceptional case is when the
source stream gets its data using a
QPDFObjectHandle::StreamDataProvider. For a more in-depth
discussion, see comments around copyForeignObject in QPDF.hh.
Fixes #219.
2019-01-05 Jay Berkenbilt <[email protected]>
* When generating appearances, if the font uses one of the
standard, built-in encodings, restrict the character set to that
rather than just to ASCII. This will allow most appearances to
contain characters from the ISO-Latin-1 range plus a few
additional characters.
* Add methods QUtil::utf8_to_win_ansi and
QUtil::utf8_to_mac_roman.
* Add method QUtil::utf8_to_utf16.
2019-01-04 Jay Berkenbilt <[email protected]>
* Add new option --optimize-images, which recompresses every image
using DCT (JPEG) compression as long as the image is not already
compressed with lossy compression and recompressing the image
reduces its size. The additional options --oi-min-width,
--oi-min-height, and --oi-min-area prevent recompression of images
whose width, height, or pixel area (width * height) are below a
specified threshold.
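The conditions described for --optimize-images can be summarized as a single predicate. The sketch below is only an illustration of that decision as stated in this entry: the struct and function names are hypothetical, and the way qpdf actually measures the recompressed size is not shown.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of one image as seen by the optimizer.
struct ImageInfo
{
    int width;
    int height;
    bool already_lossy;       // e.g. already DCT-compressed
    std::size_t current_size; // size of the existing stream data in bytes
};

// Hypothetical holder for the --oi-min-width, --oi-min-height and
// --oi-min-area values.
struct Thresholds
{
    int min_width;
    int min_height;
    long min_area;
};

// Returns true only if every condition from the entry above holds: the
// image is not already lossy, is not below any threshold, and actually
// gets smaller when recompressed.
bool worth_recompressing(
    ImageInfo const& img, Thresholds const& limits, std::size_t recompressed_size)
{
    assert(img.width >= 0 && img.height >= 0);
    if (img.already_lossy) {
        return false;
    }
    if (img.width < limits.min_width || img.height < limits.min_height) {
        return false;
    }
    long long area = static_cast<long long>(img.width) * img.height;
    if (area < limits.min_area) {
        return false;
    }
    return recompressed_size < img.current_size;
}
```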
* Add new option --collate. When specified, the semantics of
--pages change from concatenation to collation. See the manual for
a more detailed discussion. Fixes #259.
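The difference between concatenation and collation can be sketched on plain lists of page labels. This is a stand-alone illustration with hypothetical names, and it assumes that collation takes one page from each input in turn and skips inputs that have run out of pages; the manual remains the authoritative description.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef std::vector<std::string> PageList;

// Concatenation: all pages of the first input, then all of the second, ...
PageList concatenate_sketch(std::vector<PageList> const& inputs)
{
    PageList out;
    for (std::size_t f = 0; f < inputs.size(); ++f) {
        out.insert(out.end(), inputs[f].begin(), inputs[f].end());
    }
    return out;
}

// Collation (assumed semantics): first page of each input, then the
// second page of each input, and so on. Exhausted inputs are skipped.
PageList collate_sketch(std::vector<PageList> const& inputs)
{
    std::size_t longest = 0;
    for (std::size_t f = 0; f < inputs.size(); ++f) {
        if (inputs[f].size() > longest) {
            longest = inputs[f].size();
        }
    }
    PageList out;
    for (std::size_t i = 0; i < longest; ++i) {
        for (std::size_t f = 0; f < inputs.size(); ++f) {
            if (i < inputs[f].size()) {
                out.push_back(inputs[f][i]);
            }
        }
    }
    return out;
}

// Self-check with a three-page input A and a two-page input B.
void example_collate()
{
    std::vector<PageList> inputs(2);
    inputs[0].push_back("A1");
    inputs[0].push_back("A2");
    inputs[0].push_back("A3");
    inputs[1].push_back("B1");
    inputs[1].push_back("B2");
    PageList joined = concatenate_sketch(inputs); // A1 A2 A3 B1 B2
    PageList mixed = collate_sketch(inputs);      // A1 B1 A2 B2 A3
    assert(joined.size() == 5 && joined[3] == "B1");
    assert(mixed.size() == 5 && mixed[1] == "B1" && mixed[4] == "A3");
}
```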
* Add new method QPDFWriter::getFinalVersion, which returns the
PDF version that will ultimately be written to the final file. See
comments in QPDFWriter.hh for some restrictions on its use. Fixes
#266.
* When unexpected errors are found while checking linearization
data, print an error message instead of calling assert, which
causes the program to crash. Fixes #209, #231.
* Detect and recover from dangling references. If a PDF file
contained an indirect reference to a non-existent object (which is
valid), when adding a new object to the file, it was possible for
the new object to take the object ID of the dangling reference,
thereby causing the dangling reference to point to the new object.
This case is now prevented. Fixes #240.
2019-01-03 Jay Berkenbilt <[email protected]>
* Add --generate-appearances flag to the qpdf command-line tool to
trigger generation of appearance streams.
* Fix behavior of form field value setting to handle the following
cases:
- Strings are always written as UTF-16
- Check boxes and radio buttons are handled properly with
synchronization of values and appearance states
* Define constants in qpdf/Constants.h for interpretation of
annotation and form field flags
* Add QPDFAnnotationObjectHelper::getFlags
* Add many new methods to QPDFFormFieldObjectHelper for querying
flags and field types
* Add new methods for appearance stream generation. See comments
in QPDFFormFieldObjectHelper.hh for generateAppearance() for a
description of limitations.
- QPDFAcroFormDocumentHelper::generateAppearancesIfNeeded
- QPDFFormFieldObjectHelper::generateAppearance
* Bug fix: when writing form field values, always write string
values encoded as UTF-16.
* Add method QUtil::utf8_to_ascii, which returns an ASCII string
for a UTF-8 string, replacing out-of-range characters with a
specified substitute.
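The substitution behavior can be illustrated with a stand-alone function. This sketch is not the QUtil code: to_ascii_sketch is a hypothetical name, and the input is assumed to be valid UTF-8 (stray continuation bytes are silently dropped here).

```cpp
#include <cassert>
#include <string>

// Copy ASCII bytes through unchanged and emit one substitute character
// for each non-ASCII character, however many bytes it occupies.
std::string to_ascii_sketch(std::string const& utf8, char substitute = '?')
{
    std::string out;
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        unsigned char ch = static_cast<unsigned char>(utf8[i]);
        if (ch < 0x80) {
            out += static_cast<char>(ch);
        } else if ((ch & 0xC0) != 0x80) {
            // Lead byte of a multi-byte sequence: one substitute per
            // character.
            out += substitute;
        }
        // Continuation bytes (10xxxxxx) produce no output.
    }
    return out;
}

// Self-check: a two-byte character and a three-byte character each
// collapse to a single substitute.
void example_to_ascii()
{
    assert(to_ascii_sketch("na\xC3\xAFve") == "na?ve");
    assert(to_ascii_sketch("\xE2\x82\xAC" "5", '*') == "*5");
}
```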
2019-01-02 Jay Berkenbilt <[email protected]>
* Add method QPDFObjectHandle::getResourceNames that returns a set
of strings representing all second-level keys in a dictionary
(i.e. all keys of all direct dictionary members).
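"Second-level keys" can be pictured with nested maps standing in for PDF dictionaries. The types and the function name below are hypothetical and exist only for this illustration; the real method operates on QPDFObjectHandle dictionaries.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins: a resource dictionary whose direct members
// (/Font, /XObject, ...) are themselves dictionaries.
typedef std::map<std::string, std::string> Dict;
typedef std::map<std::string, Dict> Resources;

// Gather the keys of every direct dictionary member into one set.
std::set<std::string> second_level_keys(Resources const& resources)
{
    std::set<std::string> names;
    for (Resources::const_iterator outer = resources.begin();
         outer != resources.end(); ++outer) {
        for (Dict::const_iterator inner = outer->second.begin();
             inner != outer->second.end(); ++inner) {
            names.insert(inner->first);
        }
    }
    return names;
}

// Self-check: top-level keys such as /Font are not part of the result.
void example_second_level_keys()
{
    Resources res;
    res["/Font"]["/F1"] = "font object";
    res["/XObject"]["/Im1"] = "image object";
    res["/XObject"]["/Fx1"] = "form object";
    std::set<std::string> names = second_level_keys(res);
    assert(names.size() == 3);
    assert(names.count("/F1") == 1 && names.count("/Fx1") == 1);
    assert(names.count("/Font") == 0);
}
```

A set of this kind is what a caller would consult before choosing a new resource name that must not collide with an existing one.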
2018-12-31 Jay Berkenbilt <[email protected]>
* Add --flatten-annotations flag to the qpdf command-line tool for
annotation flattening.
* Add methods for flattening form fields and annotations:
- QPDFPageDocumentHelper::flattenAnnotations - integrate
annotation appearance streams into page contents with special
handling for form fields: if appearance streams are up to date
(/NeedAppearances is false in /AcroForm), the /AcroForm key of
the document catalog is removed. Otherwise, a warning is
issued, and form fields are ignored. Non-form-field
annotations are always flattened if an appearance stream can
be found.
- QPDFAnnotationObjectHelper::getPageContentForAppearance -
generate the content stream fragment to render an appearance
stream in a page's content stream as a form xobject. Called by
flattenAnnotations.
* Add method QPDFObjectHandle::mergeResources(), which merges
resource dictionaries. See detailed description in
QPDFObjectHandle.hh.
* Add QPDFObjectHandle::Matrix, similar to
QPDFObjectHandle::Rectangle, as a convenience class for
six-element arrays that are used as matrices.
2018-12-23 Jay Berkenbilt <[email protected]>
* When specifying @arg on the command line, if the file "arg" does
not exist, just treat this as a normal argument. This makes it
easier to deal with files whose names start with the @ character.
Fixes #265.
* Tweak completion so it works with zsh as well using
bashcompinit.
2018-12-22 Jay Berkenbilt <[email protected]>
* Add new options --json, --json-key, and --json-object to
generate a json representation of the PDF file. This is described
in more depth in the manual. You can also run qpdf --json-help to
get a description of the json format.
2018-12-21 Jay Berkenbilt <[email protected]>
* Allow --show-object=trailer for showing the document trailer.
* You can now use eval $(qpdf --completion-bash) to enable bash
completion for qpdf. It's not perfect, but it works pretty well.
2018-12-19 Jay Berkenbilt <[email protected]>
* When splitting pages using --split-pages, the outlines
dictionary and some supporting metadata are copied into the split
files. The result is that all bookmarks from the original file
appear, and those that point to pages that are preserved work
while those that point to pages that are not preserved don't do
anything. This is an interim step toward proper support for