The right way to convert PDF to text

What makes PDF, introduced back in 1993, such a popular format for sharing today is its “portability”: as the name suggests, PDF (Portable Document Format) is designed to keep all the formatting in your file intact, no matter how much you move the document around.

At the same time, the very quality that makes PDF ideal for sharing also makes it hard to edit. This can be frustrating if you want to use or share excerpts from the document, update it, or machine-search the file for specific information.

There are several ways to convert PDF to text and get access to its content. Let’s look at the pros and cons of some of them.

The easiest way to convert PDF to text on Mac

The best tool for converting PDF to text depends on what you need from the output file, as well as on the format of your input.

For instance, when you scan or photograph a document to digitize it, the result may be saved as a JPEG or PDF, and you may want image enhancement options in your PDF to text converter to maximize the accuracy of text recognition.

When you have a lengthy multipage PDF that you want to search or quote from, you don’t necessarily need to preserve formatting, but you do need a solution that converts the whole document automatically, without you grabbing screenshots of specific areas in the file. A searchable PDF or TXT/RTF will do as an output format.

If you need to modify both text and images in your PDF and want to keep the text’s style and formatting in place, the ideal app for you should allow exporting to Word, Google Docs, or editable PDF.

Let’s start with an easy-to-use native utility — Preview.

How to use Preview to convert PDF to text

The easiest way to convert PDF to TXT on Mac is to use Preview, the default macOS PDF viewer. It lets you copy text from your PDFs and then work on the edits in Notes, Microsoft Word, Google Docs, or another app of your choice.

Here’s what you need to do to convert PDF to text in Preview:

  1. Open the PDF that contains the text you want to copy (by default, it opens in Preview)
  2. Go to Tools in the top menu ➙ Text Selection 
  3. Select the portion of the text you need
  4. Go to Edit in the top menu ➙ Copy 
  5. Open the app you use for editing and paste the text you’ve copied from your PDF there

As you can see, Preview is very easy to use and can quickly become your go-to choice when it comes to relatively short documents with a few images to work around. It comes up short, though, if you’re looking for tools to help you improve image quality or copy the original formatting when converting scanned PDF to text. 

How to convert scanned PDF to text with OCR

OCR (Optical Character Recognition) is a technology that recognizes text within a digital image, which makes it possible to extract text from scanned PDFs as well.

If you’re looking for an OCR app that can help you extract data even from scans of old and damaged documents, try Prizmo.

Prizmo is an advanced PDF to text OCR scanner for Mac with built-in image improvement options that can help you digitize your documents, receipts, or a collection of notebook sketches. It can also turn PDFs and image files (JPEG, RAW, TIFF, and others) into TXT, RTF, or searchable PDF documents in 26 languages.

In Prizmo you don’t have to manually select the text you want to convert. The app automatically identifies all the areas with text and lets you edit with an in-app editing toolkit.

To convert PDF to TXT with Prizmo:

  1. Open Prizmo ➙ New Document
  2. Drop the PDF you want to turn into text into the app window
  3. Use image editing options at the bottom of the page if you need to 
  4. Click on Recognize at the top right corner of the window
  5. Go to File ➙ Export (choose a file format from the drop-down menu) ➙ Export to File
  6. Name your export file and choose where you'd like to save it

The only serious limitation we can think of with Prizmo is that it doesn’t export to Microsoft Office formats: output is limited to TXT, RTF, or searchable PDF.

How to convert PDFs to searchable text

If you’re more comfortable editing in Word or other MS Office formats after you convert scanned PDF to text, you’ll be glad you’ve found Nitro PDF Pro. 

Nitro PDF Pro is an easy-to-use PDF editing app that can run OCR on scans and turn them into PDFs you can edit in-app or into Word DOCX files, keeping the document’s layout, fonts, and images intact.

Follow these steps to get a searchable text document from your PDF with Nitro PDF Pro:

  1. Open Nitro PDF Pro ➙ Open File
  2. Select the PDF you want to extract text from ➙ Open
  3. Go to File in the top Nitro PDF Pro menu ➙ Export
  4. Choose language and file format from the drop-down menu ➙ Export
  5. Name your export file and choose where you'd like to save it ➙ Save

Nitro PDF Pro also covers most of your other PDF editing needs: you can add page numbers, change the order of pages or merge multiple PDFs, resize and delete images, create fillable forms, and use a wide range of annotation tools.

One concern worth mentioning about Nitro PDF Pro is that its OCR engine is cloud-based, so if confidentiality matters to you, opt for the AES 256-bit encryption the app offers.


How to convert any text on your screen to editable text

What’s great about OCR is that you don’t even need to convert your PDF to text if all you need is a specific line or two from your document.

Moreover, with an advanced text recognition tool like TextSniper you can instantly extract text from PDF or any other app open on your screen.

TextSniper captures text from whatever area of your screen you point to, be it graphics, PDFs, web pages, or even online videos. Using this simple selection tool is as easy as using Screenshot on your Mac: once you install TextSniper, just use its hotkey combination to copy any text from your screen:

  1. Press Command + Shift + 2 
  2. Select an area on your screen with the text you want copied — TextSniper will automatically recognize text and save it to your Mac’s clipboard
  3. Have the app’s text-to-speech feature read it to you or press Command + V to paste the extracted text where you want it 

TextSniper feels a little bit like Preview: it’s definitely not a tool for converting long, heavily illustrated PDFs, but it’s ideal for capturing text from images and otherwise inaccessible formats like video or on-screen presentations.

How to use Automator to convert PDF to text

Another built-in macOS app you can use as a PDF to text converter is Automator. It won’t let you convert PDF to Word and keep all the formatting, as Nitro PDF Pro would, but you can opt for rich text to preserve some of the formatting and rely on Automator to handle the rest without fail.

Here’s how you convert PDF to text with Automator:

  1. Open Automator from Applications in Finder
  2. Choose Workflow from the options in a pop-up dialogue window ➙ Choose
  3. Drag and drop your PDF into the designated area of the screen
  4. Go to Library on the top left of the screen ➙ PDFs ➙ Extract PDF Text 
  5. Drag the Extract PDF Text element into the screen on the right
  6. Choose Rich Text format
  7. Choose where you want to save your output file
  8. Click on Run in the top right corner of the screen to extract text from PDF

Automator won’t save any images in your output file, but it’s great for long text-based PDFs and it’s already installed on your Mac.
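
If you’re comfortable with a little scripting, the same kind of text-only extraction can be sketched in Python. This is a minimal sketch, not what Automator does under the hood: it assumes the third-party pypdf package is installed (pip install pypdf), and the function names and file names are hypothetical. Like Automator, it only reads a PDF’s existing text layer, so scanned pages come back empty and need OCR first.

```python
from pathlib import Path


def looks_scanned(page_texts):
    """Return True when no page yielded any text, which suggests a scan that needs OCR."""
    return not any(text.strip() for text in page_texts)


def pdf_to_txt(pdf_path, txt_path=None):
    """Write the text layer of a PDF to a .txt file and return the output path."""
    # Assumption: pypdf is installed. It is imported here so that
    # looks_scanned() above can be used without it.
    from pypdf import PdfReader

    pdf_path = Path(pdf_path)
    # Default to a .txt file next to the PDF, e.g. report.pdf -> report.txt
    txt_path = Path(txt_path) if txt_path else pdf_path.with_suffix(".txt")

    reader = PdfReader(str(pdf_path))
    # extract_text() reads the page's text layer; images are ignored.
    page_texts = [page.extract_text() or "" for page in reader.pages]

    if looks_scanned(page_texts):
        raise ValueError(f"{pdf_path.name} has no text layer; run OCR on it first")

    # Separate pages with a blank line in the output file.
    txt_path.write_text("\n\n".join(page_texts), encoding="utf-8")
    return txt_path
```

Calling pdf_to_txt("report.pdf") on a hypothetical text-based report.pdf would save report.txt in the same folder. As with Automator, images are not carried over.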

How to OCR text in a PDF with Adobe Acrobat

Since PDF as a format was invented by Adobe, it only makes sense to learn how to convert PDF to text using Adobe Acrobat.

There’s a good chance you already have the free Adobe Acrobat Reader installed on your Mac, but to run OCR on a PDF you’ll need the paid Acrobat Pro version. It recognizes text in images and scanned documents, offers PDF editing and search tools, and lets you export to Word, PPT, XLS, TXT, or RTF.

To convert scanned PDF to text, follow these steps:

  1. Open your PDF file in Acrobat
  2. Go to Tools ➙ Edit PDF
  3. Check the box next to Recognize text to make sure image-to-text editing mode is enabled under Scanned Documents
  4. Edit the text in an editable copy of your PDF you’ve just created (when you’re done, go to File in the top menu ➙ Save As)
  5. Or convert PDF to TXT or Word document for editing (choose Export PDF from the panel on the right ➙ Convert)

Adobe Acrobat Pro is certainly one of the more versatile PDF to text converters: its text recognition is highly accurate, and it lets you edit right inside your PDF while automatically matching the original fonts. But since it’s a full-blown pro app, you may find it RAM-heavy and more technical than most use cases require.

What about online OCR for PDFs?

If you want to convert PDF to text using OCR online, you’ll have no problem finding loads of free text recognition services. These might be convenient for an occasional PDF to text OCR scan, but we’d be cautious about recommending them for regular use.

Apart from varying accuracy levels and problems with keeping the document formatting in place when converting to Word, the key issues with using OCR online are safety and confidentiality. 

If you pay closer attention to the privacy policies these free online OCR services declare, you’ll find that some of them openly state that they share user data with third parties and can store the uploaded files on their servers for up to seven days. 

So we recommend you use offline PDF to text converters when working with documents containing any personal or confidential info. From a cybersecurity perspective, it's also preferable to use reliable apps, like Prizmo or Nitro PDF Pro, to avoid the risk of getting your Mac infected with malware.

Convert PDF to text and break free of the limitations

Thanks to optical character recognition technology, you’re no longer powerless when it comes to editing and searching inside PDFs.

You can use the native macOS Preview to capture text from PDFs and images, or the more advanced TextSniper to extract text from any app active on your screen, including YouTube videos or webinar streams. For long text-based PDFs, use Automator, which is already built into your Mac for free.

For a more universal PDF editing and conversion toolkit, opt for Prizmo, Nitro PDF Pro, or Adobe Acrobat Pro. They all offer powerful image enhancement, text editing, and a wide range of export options, including rich text (Prizmo), MS Office formats (Nitro PDF Pro, Adobe Acrobat Pro), and editable PDF files (Adobe Acrobat Pro).
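
If it helps to see these recommendations side by side, here is one way to condense them into a few lines of Python. Treat it as a rough sketch of the advice in this article rather than a definitive rule: the function name and its three yes/no parameters are made up for illustration, and real documents often call for more than one tool.

```python
def suggest_tool(scanned=False, needs_office_export=False, only_a_snippet=False):
    """Map a few yes/no requirements to the tools recommended in this article."""
    if only_a_snippet:
        # A line or two is all you need: no full conversion required.
        return "TextSniper" if scanned else "Preview"
    if needs_office_export:
        # Editing in Word or other MS Office formats afterwards.
        return "Nitro PDF Pro or Adobe Acrobat Pro"
    if scanned:
        # Scans and photos benefit from OCR with image enhancement.
        return "Prizmo"
    # Long, text-based PDFs where images don't matter.
    return "Automator"


print(suggest_tool(scanned=True))              # Prizmo
print(suggest_tool(needs_office_export=True))  # Nitro PDF Pro or Adobe Acrobat Pro
print(suggest_tool())                          # Automator
```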

Also, remember that you can make your life much easier if all you want is to be able to search PDFs for information — a high-speed AI-assisted PDF Search can handle the job without converting PDF to TXT.
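
And if you have already converted a PDF to TXT, a basic keyword search is easy to script yourself. The snippet below is a minimal sketch in Python with a hypothetical function name and made-up sample text standing in for the contents of an exported file; it does plain case-insensitive matching and nothing smarter.

```python
def find_matches(text, term):
    """Return (line_number, line) pairs for lines containing term, ignoring case."""
    needle = term.casefold()
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line.casefold()
    ]


# Hypothetical sample standing in for the contents of an exported TXT file.
sample = "Invoice 2041\nTotal due: $120\nPayment terms: 30 days\nTOTAL paid: $0"

for number, line in find_matches(sample, "total"):
    print(f"{number}: {line}")
# Prints:
# 2: Total due: $120
# 4: TOTAL paid: $0
```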

Finally, don’t pass up the chance to take Prizmo, Nitro PDF Pro, PDF Search, and TextSniper for a free test run with the seven-day trial of Setapp, a platform of high-performance, best-in-class iOS and macOS apps. Learn how to convert scanned PDF to text and check out more than 240 potential options to boost your productivity.
