
In this tutorial, we will also convert the document format. We will convert a PDF to a text file through PHP and then read its contents to calculate the number of characters in the file.
SOLUTION
Use the pdftohtml Linux library.
Please download and install the pdftohtml library from http://sourceforge.net/projects/pdftohtml/
The code below calls the pdftotext command and expects to find it at /usr/bin/pdftotext.
Code to execute the PDF conversion and the character calculation.
Linux command to convert the PDF to text format:
'/usr/bin/pdftotext ' . escapeshellarg($file_path); // $file_path must be the absolute server path.
PHP
shell_exec('/usr/bin/pdftotext ' . escapeshellarg($file_path));
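Uploaded file names often contain spaces or other characters that the shell treats specially, so it can help to quote the path before it is passed to the command. The sketch below is only an illustration: the helper names build_pdftotext_command and convert_pdf_to_text are made up for this example, and it assumes pdftotext lives at /usr/bin/pdftotext as in this tutorial. It also passes the output file name explicitly as a second argument instead of relying on the default name.

```php
<?php
// Hypothetical helper (not part of the original tutorial): builds the
// pdftotext command line with both paths quoted for the shell.
function build_pdftotext_command(string $pdf_path, string $text_path): string
{
    // escapeshellarg() quotes each path so that spaces or shell
    // metacharacters in a file name are passed through literally.
    return '/usr/bin/pdftotext ' . escapeshellarg($pdf_path) . ' ' . escapeshellarg($text_path);
}

// Hypothetical helper: runs the conversion and reports whether it worked.
function convert_pdf_to_text(string $pdf_path, string $text_path): bool
{
    $output = [];
    $exit_code = 1;
    // exec() stores the command's exit status in $exit_code; 0 means success.
    exec(build_pdftotext_command($pdf_path, $text_path) . ' 2>&1', $output, $exit_code);
    return $exit_code === 0 && is_file($text_path);
}
```

Calling convert_pdf_to_text($file_path, $text_path) returns true only when the command exited cleanly and the text file exists, which makes a failed conversion easier to notice than with a bare shell_exec call.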
Complete code to upload a file to the processed folder in your root directory, convert it, and count its characters.
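// $filen is assumed to hold the name of the file field in the upload form.
$upload_name = basename($_FILES[$filen]['name']); // basename() drops any directory parts
$file_path = $_SERVER['DOCUMENT_ROOT'] . '/processed/' . $upload_name;
if (move_uploaded_file($_FILES[$filen]['tmp_name'], $file_path)) {
    // The text file goes next to the PDF, with .pdf replaced by .txt.
    $text_path = preg_replace('/\.pdf$/i', '', $file_path) . '.txt';
    // shell_exec() returns only after the command has finished, so no sleep() is needed.
    $output = shell_exec('/usr/bin/pdftotext ' . escapeshellarg($file_path) . ' ' . escapeshellarg($text_path));
    if (is_file($text_path)) {
        $contents = file_get_contents($text_path);
        // Count every character except spaces.
        $file_count = strlen(str_replace(' ', '', $contents));
    }
}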
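The count above removes only plain spaces, so line breaks and the form feed characters that pdftotext puts between pages are still counted, and strlen counts bytes rather than characters. If you want a count that ignores all whitespace, or a word count for a quote, a minimal sketch could look like the following. The function names are hypothetical, and the code assumes the text file is UTF-8.

```php
<?php
// Hypothetical helper: counts characters, ignoring all whitespace
// (spaces, tabs, line breaks and form feeds).
function count_non_whitespace_chars(string $text): int
{
    // With the /u modifier each UTF-8 character is matched once.
    // preg_match_all() returns the number of matches, or false on error.
    $count = preg_match_all('/\S/u', $text);
    return $count === false ? 0 : $count;
}

// Hypothetical helper: counts words as runs of non-whitespace characters.
function count_words(string $text): int
{
    $count = preg_match_all('/\S+/u', $text);
    return $count === false ? 0 : $count;
}
```

With these helpers, $file_count could be set with count_non_whitespace_chars($contents) once the text file has been read.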
Troubleshooting
1. The shell_exec function will not execute if you don't have permission to run shell
commands or if your PHP is running in safe mode.
2. This script generates a text file with the same name and in the same directory as
the PDF file. If the file isn't created in that directory but your program still works,
you will be able to find the file in the root directory. This means you have to correct
your file path.
3. If the script cannot upload the file or calculate the count, it is necessary to change
the permissions of the processed folder to 777.
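The checks above can also be run from PHP before any file is uploaded. This is only a rough sketch: the function names and the array keys are invented for the example, and the binary path and folder are passed in as parameters so you can supply your own values.

```php
<?php
// Hypothetical helper: reports whether shell_exec() can be called.
function shell_exec_available(): bool
{
    if (!function_exists('shell_exec')) {
        return false;
    }
    // Older PHP versions keep disabled functions defined, so also check php.ini.
    $disabled = array_map('trim', explode(',', (string) ini_get('disable_functions')));
    return !in_array('shell_exec', $disabled, true);
}

// Hypothetical helper: returns a list of problems found, keyed by what failed.
function preflight_problems(string $binary, string $upload_dir): array
{
    $problems = [];
    if (!shell_exec_available()) {
        $problems['shell_exec'] = 'shell_exec is disabled or unavailable';
    }
    if (!is_executable($binary)) {
        $problems['binary'] = $binary . ' is missing or not executable';
    }
    if (!is_dir($upload_dir) || !is_writable($upload_dir)) {
        $problems['directory'] = $upload_dir . ' is missing or not writable';
    }
    return $problems;
}
```

For example, preflight_problems('/usr/bin/pdftotext', $_SERVER['DOCUMENT_ROOT'] . '/processed') returns an empty array when all three checks pass.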
If you have further questions about this post, kindly post your comments.

Thanks. For converting PDF to HTML on Windows, I am using the AnyBizSoft PDF to HTML converter. It supports the conversion of encrypted PDFs.
The code can work on both Linux and Windows. It requires PHP. You can use this code for a proofreading website to calculate the number of characters or words and provide a quote instantly.