Comments on Google Operating System: Google Adds OCR for PDF Files and Images
Author: Alex Chitu (http://www.blogger.com/profile/02618542750965508582)
Updated: 2024-03-18T02:14:57.204-07:00

2018-02-05T09:17:30.129-08:00
Yes, this is really helpful content. Please check my site; my website is a free online OCR tool: http://getlatestthings.com/
Posted by Anonymous (https://www.blogger.com/profile/07051384745829033101)

2012-05-25T13:56:21.499-07:00
I wonder how good Google OCR is. I uploaded an 88-page patent file and it seems I can search down to page 10. I haven't looked at the fidelity or anything yet. But bummer, it seems to stop after page 10... I'll wait and see if more gets converted over time, but I doubt it.
Posted by TCG (https://www.blogger.com/profile/04392493204600214596)

2011-11-01T08:36:50.085-07:00
OCR is important too, but as long as "fuzzy search" works to a point where you can find the keywords you search on... then that would be a good feature as well.
I know it doesn't help those looking to convert documents to text, but I am specifically looking at archiving documents and being able to search them when the need arises.
Posted by tekgems (https://www.blogger.com/profile/07428585855516396442)

2011-01-24T06:09:05.891-08:00
I think that http://www.e-ocr.com resolves this problem.
Posted by Anonymous

2011-01-13T16:51:56.044-08:00
WOW! I was officially sock-less after I did this:
1) Scanned a 2-page paper doc (some plain text, some texty images) at 300dpi to 'blind' PDF [800kb]. Blind meaning no searchable text, just an image.
2) Input PDF to free-online-ocr (FOO).
3) Output as PDF [1800kb] (yes, it sounds silly, thus it was the last combo I tried)
4) It looked exactly like the original. Searched for words that FOO and Google Docs (GDX) had trouble converting to text formats in previous combos. No trouble. As far as I can tell EVERY word in this PDF (although not in the images in the PDF) was searchable. This is where I lost my sox.
5A) Uploaded to GDX without conversion to GDX format. Every word remained searchable.
5B) Uploaded to GDX WITH conversion to GDX format... all the trouble words are messed up again.

Moving forward I'll convert 'blind' PDFs to searchable PDFs via FOO, then upload to GDX for searchable storage.
Posted by Bort

2011-01-13T15:41:53.062-08:00
Besides Google OCR, I prefer to use other free online OCR services.
I highly recommend the free beta software offered by Ricoh Innovations at: http://beta.rii.ricoh.com/betalabs/content/document-conversion
Posted by Nat (https://www.blogger.com/profile/06354769397765956900)

2011-01-13T11:19:44.089-08:00
This is really useful for us students who don't want to print the PDF lecture slides the professor has posted, and would like to type additional notes.
Posted by Anonymous

2010-09-29T09:04:48.035-07:00
Tried a TON of OCR software, basically anything that was freely available as a trial or shareware... The result is that most are barely functional when trying to recover a ton of VBasic code I lost due to a computer issue but had a printed hard copy of. I did, however, find a bright light: OCRkit. I found it to be amazingly accurate in preserving the formatting, text, complex equation formats and all small characters. I know it's not Google, but it's almost free compared to the cost of other commercial software (i.e. 50 bucks vs 200-500 for some). If you need this, try it.
I am VERY curious about what OCR engine they use and how it is possible that it's SO much better than other presumably polished legacy products that have been around for years, yet this is very new software in its second revision....
Posted by Anonymous

2010-09-17T07:13:09.984-07:00
http://free-online-ocr.com/ <---good find.
Posted by UFC (http://ufcnews.com)

2010-09-05T15:08:42.979-07:00
How much did Adobe pay you?
Posted by Anonymous

2010-06-30T08:40:13.483-07:00
This is a great feature; moreover, most of the good OCR tools are shareware. But some good free OCR tools to convert scanned images to text / Word documents are also available, as listed at
http://www.globinch.com/2010/06/08/best-pdf-ocr-tools-to-convert-scanned-images-to-text-word-documents/
Posted by Globinch (http://www.globinch.com/)

2010-06-24T15:29:01.802-07:00
I think the new feature added to Google Docs is fantastic. These people really know how to add value to their products. Keep it up.
Posted by Anonymous

2010-06-24T05:30:46.232-07:00
Does not work. Does not even tell why.
Just says "couldn not upload".
Now what?
Posted by Anonymous

2010-06-23T18:19:57.349-07:00
Make no mistake, no OCR on this planet can guarantee 100% correct conversions all the time, including Google's latest entry. This leads to frustration and disbelief. No one tells you that, including Google.
(Unless of course your documents are picture perfect all the time).

Which means you must proofread every comma, full stop and character converted, so if your PDF is 75 pages it will be a pain and will take ages.

Unless it's supported by manual proofreading and line editing, OCR may not always work for everyone all the time!
Posted by Anonymous

2010-06-22T22:39:55.954-07:00
Did not work for me either. I tried with a 74-page PDF but the converted doc had only the first couple of pages. It goes nuts when there are tables too.
Posted by Danushka (http://danushka-menikkumbura.blogspot.com/)

2010-06-22T18:37:10.083-07:00
PS Resolution is not an issue:
Resolution is 15-20 pixels per character height.

Anonymous said...

 Unfortunately, I have to give a thumbs down, way down.

 Google OCR fails completely (resulting document is empty or contains a single fax number) in contrast with Acrobat 7.0 which completes OCR of image of text that is a mixture of Greek and English and contains email addresses (with Latin characters). Acrobat gets all the email addresses (my goal, here).
 June 22, 2010 6:31 PM
Posted by Anonymous

2010-06-22T18:31:37.584-07:00
Unfortunately, I have to give a thumbs down, way down.

Google OCR fails completely (resulting document is empty or contains a single fax number) in contrast with Acrobat 7.0 which completes OCR of image of text that is a mixture of Greek and English and contains email addresses (with Latin characters). Acrobat gets all the email addresses (my goal, here).
Posted by Anonymous

2010-06-22T13:30:55.365-07:00
Will there be an API for this functionality?
Posted by ndaversa (https://www.blogger.com/profile/03082877325925036031)

2010-06-22T12:28:07.194-07:00
Does anyone know if the OCR being incorporated in Google Docs is the Tesseract engine? I would guess that it is, but I hate to guess.
Posted by Anonymous

2010-06-22T11:53:12.747-07:00
Same question here.
Why don't they use the Google n-grams to correct the phrases?
Posted by Anonymous

2010-06-22T11:42:59.107-07:00
Unfortunately, it didn't work so well: https://docs.google.com/leaf?id=0B8__AElz7h5oODhmMTA1ZjAtYWYxYy00MWQ0LTk1OTItNzVkODdkZTAxYzc3&hl=de
Posted by coani (http://multimedia-maniac.com)

2010-06-22T09:57:41.753-07:00
Great ... this is actually what I expected from SharePoint ... similar to OneNote. With this you could easily extract part numbers from images of vehicle parts and write them to a database automatically ... :)
Posted by coani (http://multimedia-maniac.com)

2010-06-22T09:53:52.481-07:00
I am using the same OCR library to build qiqqa.com and can testify that it is an interesting technical problem to know whether you should "coerce" the OCR output you see to its nearest English words, especially in scientific literature. Sometimes it makes it worse! But yeah, hopefully frequent triplets like "making a diiference" should be reliably alterable!

I love to see this progress though!
Jimme
Posted by Jimme Jardine (http://www.qiqqa.com)

2010-06-22T06:59:36.034-07:00
http://code.google.com/p/ocropus/
been waiting for this to hit
Posted by Anonymous

2010-06-22T06:52:23.135-07:00
Any idea if it supports Japanese kanji?
Posted by Anonymous