PDF Connector
Search and extract data from PDF documents
Actions
Extract All Text
Extracts all the text from the specified PDF document and returns it as an array of texts.
| Input | Comments | Default |
|---|---|---|
| PDF data | This must refer to a buffer containing the raw bytes of a PDF. |
Extract Page
Returns the specified page in the given PDF document as a new separate PDF document.
| Input | Comments | Default |
|---|---|---|
| PDF data | This must refer to a buffer containing the raw bytes of a PDF. | |
| Page Number | This specifies a page in a PDF document by number. |
Extract Page Text
Locates and extracts pages text from the specified PDF document that matches the specified page range.
| Input | Comments | Default |
|---|---|---|
| PDF data | This must refer to a buffer containing the raw bytes of a PDF. | |
| Page Start | This specifies the starting page to extract from the PDF document. | |
| Page End | This specifies the ending page to extract from the PDF document. If not defined, will only extract the page on the pageStart input. |
Extract Text by Pattern
Extracts text from the specified PDF document that matches the search text.
| Input | Comments | Default |
|---|---|---|
| PDF data | This must refer to a buffer containing the raw bytes of a PDF. | |
| Search Pattern | This is the text to search for in the PDF document. | |
| Characters After | This specifies the number of characters to extract from the PDF document after the search pattern found. If not specified, the entire page is returned. | |
| Case Sensitive | This specifies whether searching should be case-sensitive. You can choose true or false. | false |
Find Pattern
Searches the PDF document and returns a list of page numbers containing text that satisfies the search criteria.
| Input | Comments | Default |
|---|---|---|
| PDF data | This must refer to a buffer containing the raw bytes of a PDF. | |
| Search Pattern | This is the pattern to search for in the PDF document. | |
| Case Sensitive | This specifies whether searching should be case-sensitive. You can choose true or false. | false |
| Use Regex | This specifies whether the search pattern is a regular expression. | false |
| Contains? | This specifies whether to return page numbers that either contain or don't contain the search pattern. Options are true or false | true |
Page Numbers
Returns a sequence of page numbers for the specified PDF document, from 1 to the last page number.
| Input | Comments | Default |
|---|---|---|
| PDF data | This must refer to a buffer containing the raw bytes of a PDF. |