Amazing-Python-Scripts

Text_Extract

Text extraction from images, OCR with Tesseract, and basic image manipulation are all important yet very basic scripting tasks.

This script uses pytesseract to extract text from images. Since pytesseract only recognizes the text and returns it for printing, this script adds the ability to write the extracted text to a txt and/or csv file.
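The idea above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names `extract_text` and `save_text` are hypothetical, and the CSV layout (one row per non-empty line of text) is an assumption, since the README does not describe the file format.

```python
import csv
from pathlib import Path


def extract_text(image_path):
    """Run Tesseract OCR on one image and return the recognized text."""
    # Imported lazily so save_text() below can be used without OCR installed.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image)


def save_text(text, output_dir, stem, as_txt=True, as_csv=True):
    """Write text to <stem>.txt and/or <stem>.csv and return the paths written."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    if as_txt:
        txt_path = out / f"{stem}.txt"
        txt_path.write_text(text, encoding="utf-8")
        written.append(txt_path)
    if as_csv:
        csv_path = out / f"{stem}.csv"
        # Assumed layout: a header row, then one row per non-empty line.
        lines = [line.strip() for line in text.splitlines() if line.strip()]
        with csv_path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["line_number", "text"])
            for number, line in enumerate(lines, start=1):
                writer.writerow([number, line])
        written.append(csv_path)
    return written
```

With Tesseract installed, a call such as `save_text(extract_text("scan.png"), "out", "scan")` would produce `out/scan.txt` and `out/scan.csv`; the file name `scan.png` is only a placeholder.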

Setup instructions

  • Set up a Python 3.x virtual environment.
  • Activate the environment.
  • Install the dependencies using pip3 install -r requirements.txt.
  • You are all set, and the script is ready to run.
  • Carefully follow the instructions.

Further Reading

Newcomers often struggle with Tesseract the first time; this is a direct link to the installer.

Instructions for setting up OCR can be found here.

Adding Tesseract to the PATH environment variable can simplify the code, because the script then does not need a hard-coded path to the executable. This and this link will help you achieve that.
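As a rough sketch of how the PATH lookup can be used from Python: the helpers `find_tesseract` and `configure_pytesseract` below are hypothetical names, and the fallback path is whatever location you installed Tesseract to. The `pytesseract.pytesseract.tesseract_cmd` attribute is how pytesseract is told where the executable lives.

```python
import shutil


def find_tesseract(fallback=None):
    """Return the tesseract executable found on PATH, or the fallback path."""
    return shutil.which("tesseract") or fallback


def configure_pytesseract(fallback=None):
    """Point pytesseract at the located executable and return its path."""
    # Imported lazily so find_tesseract() works even without pytesseract.
    import pytesseract

    command = find_tesseract(fallback)
    if command is None:
        raise FileNotFoundError(
            "Tesseract was not found on PATH and no fallback path was given."
        )
    pytesseract.pytesseract.tesseract_cmd = command
    return command
```

If Tesseract is already on PATH, the fallback is never used, so the same code can run unchanged on machines with different install locations.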

Usage

Make sure that Tesseract is installed in the proper directory, then run the code according to the comments and guidelines.

Sample -
Enter the Folder name containing Images: <Name of Folder>
Enter your desired output location: <Name of Folder>
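A batch run over the two folders from the prompts above might look like the sketch below. It is an assumption about the flow, not the script's real implementation: `list_images`, `process_folder`, and the set of accepted file extensions are hypothetical, and the OCR step is passed in as a function so the loop can be tried without Tesseract.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def list_images(folder):
    """Return the image files directly inside folder, sorted by path."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


def process_folder(input_dir, output_dir, extract):
    """Apply extract(image_path) to every image and save one .txt per image."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for image_path in list_images(input_dir):
        target = out / f"{image_path.stem}.txt"
        target.write_text(extract(image_path), encoding="utf-8")
        written.append(target)
    return written
```

In an interactive run, the two arguments would come from `input("Enter the Folder name containing Images: ")` and `input("Enter your desired output location: ")`, with `extract` being a function that calls `pytesseract.image_to_string` on the opened image.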

Output

Image containing Text

Before Compression

After Extraction

After Backup

Author(s)

Made by Vybhav Chaturvedi
