Bump tika-core from 2.8.0 to 2.9.0

Bumps tika-core from 2.8.0 to 2.9.0.
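
As a rough illustration, in a Maven build this bump usually amounts to a one-line version change. The fragment below is a sketch that assumes the project uses Maven and declares the version directly on the dependency; the actual build file is not shown in this MR, and a build that manages the version through a property, a BOM, or a parent POM would change that declaration instead.

```xml
<!-- pom.xml (assumed layout; not taken from this MR's diff) -->
<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-core</artifactId>
  <version>2.9.0</version> <!-- previously 2.8.0 -->
</dependency>
```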

Changelog

Sourced from tika-core's changelog.

Release 2.9.1 - ??

  • The InputStreamDigester now calculates stream length (TIKA-4016).

Release 2.9.0 - 8/23/2023

  • With user configuration, the PDFParser can now throw an EncryptedDocumentException for Microsoft IRM PDF containers with encrypted payloads. Separately, the PDFParser now throws an EncryptedDocumentException instead of an IOException if the security handler cannot be found (TIKA-4082).

  • Fix bug that led to duplicate extraction of macros from some OLE2 containers (TIKA-4116).

  • Parse iframe's srcdoc as an embedded file (TIKA-3109).

  • Add detection of warc.gz as a specialization of gz and parse as if a standard WARC (TIKA-4048).

  • Allow users to modify the attachment limit size in the /unpack resource (TIKA-4039).

  • Fixed write limit bug in RecursiveParserWrapper (TIKA-4055).

  • Add mime detection for many files with thanks to Gregory Lepore (TIKA-3992).

  • Fixed iWork 13 keynote detection on files with wrong extension (TIKA-4111).

Release 2.8.0 - 5/11/2023

  • Enable counting and/or parsing of incremental updates in PDFs. This is an experimental feature and may change in later releases (TIKA-4017).

  • Fixed bug that prevented the loading of CompositeExternalParser in tika-app and tika-server-standard. This parser will call exiftool and ffmpeg if those are installed, as was the behavior in Tika 1.x. Exclude org.apache.tika.parser.external.CompositeExternalParser if you do not want this behavior (TIKA-4022).

  • Removed the shading of tika-parsers-standard-module (TIKA-4038).

  • Enable optional extraction of file system metadata in FileSystemFetcher (TIKA-4035).

  • Allow pretty printing in FileSystemEmitter (TIKA-4034).

  • Add detection of, and a new mime type for, older PostScript-based Adobe Illustrator "application/illustrator+ps" files (TIKA-3971).

  • Add magic detection for canon raw file types: crw, cr2 and cr3 (TIKA-3991).

  • Add detection for ONIX message files (TIKA-4011).

  • Add detection and a parser for ActiveMime files (TIKA-3987).

... (truncated)

Commits
  • ce99af8 [maven-release-plugin] prepare release 2.9.0-rc1
  • f285d4f rat plugin fixes in prep for 2.9.0-rc1
  • 2de3c91 Merge pull request #1299 from apache/dependabot/maven/com.azure-azure-storage...
  • 68a9a71 Merge pull request #1298 from apache/dependabot/maven/aws.version-1.12.535
  • cf25c51 Bump com.azure:azure-storage-blob from 12.23.0 to 12.23.1
  • c79db30 Bump aws.version from 1.12.534 to 1.12.535
  • 7888641 Update minor version in prep for next release
  • 8c94fc5 Merge pull request #1296 from apache/dependabot/maven/test.containers.version...
  • bbc66ae Merge pull request #1297 from apache/dependabot/maven/aws.version-1.12.534
  • 292cbed Bump aws.version from 1.12.533 to 1.12.534
  • Additional commits viewable in compare view


Dependabot commands
You can trigger Dependabot actions by commenting on this MR:
  • $dependabot rebase will rebase this MR
  • $dependabot recreate will recreate this MR, overwriting any manual changes and resolving conflicts
