%define langdata() \
%package %{1}\
Group: Graphics/Utilities \
Summary: %{2}%{?3: (%3)} language data for Tesseract \
Requires: %{name} >= 3.00 \
Provides: %{name}-language = %{version}-%{release} \
BuildArch: noarch \
%description %{1} \
Tesseract data files required to recognize %{?3:%3 }%{2} text. \
%files %{1} \
%{_datadir}/tessdata/%{1}.* \
%{nil}

%define major 3
%define libtesseract %mklibname %{name} %{major}
%define devtesseract %mklibname %{name} -d

Name:       tesseract
Version:    3.02.02
Release:    %mkrel 3
Summary:    A high-performance OCR engine
URL:        http://code.google.com/p/%{name}-ocr/
License:    Apache
Group:      Graphics/Utilities
Source0:    %{name}-ocr-%{version}.tar.gz
Source1:    %{name}-ocr-3.01.ara.tar.gz
Source2:    %{name}-ocr-3.01.eng.tar.gz
Source3:    %{name}-ocr-3.01.heb-com.tar.gz
Source4:    %{name}-ocr-3.01.heb.tar.gz
Source5:    %{name}-ocr-3.01.hin.tar.gz
Source6:    %{name}-ocr-3.01.osd.tar.gz
Source7:    %{name}-ocr-3.01.slk-frak.tar.gz
Source8:    %{name}-ocr-3.01.tha.tar.gz
Source9:    bul.traineddata.gz
Source10:   cat.traineddata.gz
Source11:   ces.traineddata.gz
Source12:   chi_sim.traineddata.gz
Source13:   chi_tra.traineddata.gz
Source14:   chr.traineddata.gz
Source15:   dan-frak.traineddata.gz
Source16:   dan.traineddata.gz
Source17:   deu-frak.traineddata.gz
Source18:   deu.traineddata.gz
Source19:   ell.traineddata.gz
Source20:   fin.traineddata.gz
Source21:   fra.traineddata.gz
Source22:   hun.traineddata.gz
Source23:   ind.traineddata.gz
Source24:   ita.traineddata.gz
Source25:   jpn.traineddata.gz
Source26:   kor.traineddata.gz
Source27:   lav.traineddata.gz
Source28:   lit.traineddata.gz
Source29:   nld.traineddata.gz
Source30:   nor.traineddata.gz
Source31:   pol.traineddata.gz
Source32:   por.traineddata.gz
Source33:   ron.traineddata.gz
Source34:   %{name}-ocr-3.02.rus.tar.gz
Source35:   slk.traineddata.gz
Source36:   slv.traineddata.gz
Source37:   spa.traineddata.gz
Source38:   srp.traineddata.gz
Source39:   swe-frak.traineddata.gz
Source40:   swe.traineddata.gz
Source41:   tgl.traineddata.gz
Source42:   tur.traineddata.gz
Source43:   ukr.traineddata.gz
Source44:   %{name}-ocr-3.02.vie.tar.gz
Source45:   %{name}-ocr-3.02.grc.tar.gz
Patch0:     tesseract-3.02.02-mga-format_security-permdawg.cpp.patch
Patch1:     tesseract-ocr-automake-1.13.patch
BuildRequires: tiff-devel
BuildRequires: jpeg-devel
BuildRequires: leptonica-devel
Requires:   %{name}-language >= 3.0

%description
The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV
Accuracy test. Since then it has had little work done on it, but it is
probably one of the most accurate open source OCR engines available.
The source code will read a binary, grey or color image and output
text. A tiff reader is built in that will read uncompressed TIFF
images, or libtiff can be added to read compressed images.

%files
%doc AUTHORS COPYING NEWS README ReleaseNotes ChangeLog
%{_bindir}/*
%{_datadir}/tessdata
%{_mandir}/man*/*
%exclude %_datadir/tessdata/*.traineddata

#-----------------------------------------------------------------

%package -n %{libtesseract}
Summary:    %{name} support library
Group:      System/Libraries

%description -n %{libtesseract}
%{name} library.

%files -n %{libtesseract}
%_libdir/lib%{name}*.so.%{major}*

#-----------------------------------------------------------------

%package -n %{devtesseract}
Summary:    Development files for %{name}
Group:      Development/C++
Requires:   %{libtesseract} = %{version}-%{release}
Provides:   %{devtesseract} = %{version}-%{release}
Provides:   %{name}-devel = %{version}-%{release}
Obsoletes:  %{name}-devel < %{version}-%{release}
Obsoletes:  %{devtesseract} < 2.04

%description -n %{devtesseract}
The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV
Accuracy test. Since then it has had little work done on it, but it is
probably one of the most accurate open source OCR engines available.
The source code will read a binary, grey or color image and output
text. A tiff reader is built in that will read uncompressed TIFF
images, or libtiff can be added to read compressed images.

%files -n %{devtesseract}
%{_includedir}/%{name}
%{_libdir}/*.so
%{_libdir}/pkgconfig/%{name}.pc

#-----------------------------------------------------------------

%package osd
Group:      Graphics/Utilities
Summary:    Orientation & script detection data pack for %{name}

%description osd
Data files required to recognize text orientation and scripts.

%files osd
%{_datadir}/tessdata/osd.*

#-----------------------------------------------------------------

%package heb-com
Group:      Graphics/Utilities
Summary:    Hebrew (community) language data for Tesseract
Requires:   %{name} >= 3.00
Provides:   %{name}-language = %{version}-%{release}
BuildArch:  noarch

%description heb-com
Tesseract data files required to recognize Hebrew community text.

%files heb-com
%{_datadir}/tessdata/heb-*
%doc tessdata/heb-.README

#-----------------------------------------------------------------

%langdata ara Arabic
%langdata bul Bulgarian
%langdata cat Catalan
%langdata ces Czech
%langdata chi_sim Chinese simplified
%langdata chi_tra Chinese traditional
%langdata chr Cherokee
%langdata dan-frak Danish fraktur
%langdata dan Danish
%langdata deu-frak German fraktur
%langdata deu German
%langdata ell Greek
%langdata eng English
%langdata fin Finnish
%langdata fra French
%langdata heb Hebrew
%langdata hin Hindi
%langdata hun Hungarian
%langdata ind Indonesian
%langdata ita Italian
%langdata jpn Japanese
%langdata kor Korean
%langdata lav Latvian
%langdata lit Lithuanian
%langdata nld Dutch
%langdata nor Norwegian
%langdata pol Polish
%langdata por Portuguese
%langdata ron Romanian
%langdata rus Russian
%langdata slk Slovakian
%langdata slk-frak Slovakian fraktur
%langdata slv Slovenian
%langdata spa Spanish
%langdata srp Serbian latin
%langdata swe-frak Swedish fraktur
%langdata swe Swedish
%langdata tgl Tagalog
%langdata tha Thai
%langdata tur Turkish
%langdata ukr Ukrainian
%langdata vie Vietnamese

%prep
%setup -q -n %{name}-ocr -b1 -b2 -b3 -b4 -b5 -b6 -b7 -b8 -b34 -b44 -b45

# Decompress the gzip-only language files into tessdata/,
# dropping the directory part and the .gz suffix from each name.
for archive in \
    %{SOURCE9} %{SOURCE10} %{SOURCE11} %{SOURCE12} %{SOURCE13} \
    %{SOURCE14} %{SOURCE15} %{SOURCE16} %{SOURCE17} %{SOURCE18} \
    %{SOURCE19} %{SOURCE20} %{SOURCE21} %{SOURCE22} %{SOURCE23} \
    %{SOURCE24} %{SOURCE25} %{SOURCE26} %{SOURCE27} %{SOURCE28} \
    %{SOURCE29} %{SOURCE30} %{SOURCE31} %{SOURCE32} %{SOURCE33} \
    %{SOURCE35} %{SOURCE36} %{SOURCE37} %{SOURCE38} %{SOURCE39} \
    %{SOURCE40} %{SOURCE41} %{SOURCE42} %{SOURCE43}
do
    filename=$(echo "$archive" | sed -e 's|^.*/||;s|\.gz$||')
    gzip -cd "$archive" > "./tessdata/$filename"
done

%patch0 -p0 -b .tesseract-3.02.02-mga-format_security-permdawg.cpp.patch
%patch1 -p1 -b .automake-1_13

%build
./autogen.sh
%configure2_5x
%make

%install
rm -fr %buildroot
%makeinstall_std

# Install the language data next to the files placed by make install.
for file in tessdata/*cube.* tessdata/*.traineddata
do
    install -m 644 -D $file %{buildroot}%{_datadir}/tessdata
done

# Do not package static libraries or libtool archives.
rm -fr %buildroot%{_libdir}/lib%{name}*.a
rm -fr %buildroot%{_libdir}/lib%{name}*.la

%changelog
* Mon Jan 14 2013 umeabot <umeabot> 3.02.02-3.mga3
+ Revision: 384413
- Mass Rebuild - https://wiki.mageia.org/en/Feature:Mageia3MassRebuild

  + boklm <boklm>
    - Update group: Graphics/Other -> Graphics/Utilities

* Sun Jan 06 2013 cjw <cjw> 3.02.02-2.mga3
+ Revision: 339739
- patch1: fix build with automake 1.13

* Tue Oct 30 2012 barjac <barjac> 3.02.02-1.mga3
+ Revision: 311612
- new version
- removed two patches fixed upstream
- created one new patch for format security error
- updated several language files
- removed un-needed mv and delete of tessdata files

* Thu Oct 18 2012 barjac <barjac> 3.01-2.mga3
+ Revision: 307809
- lang data files should be noarch

* Thu Oct 18 2012 barjac <barjac> 3.01-1.mga3
+ Revision: 307797
- New version
- many new language data files
- language file handling improved thanks to Mdv
- two minor patches fixing a missing include and a format security error
- group updated to new policy
- two unneeded compiler error overrides removed

* Fri Dec 23 2011 fwang <fwang> 3.00-4.mga2
+ Revision: 186473
- rebuild for new libtiff

* Tue May 24 2011 ahmad <ahmad> 3.00-3.mga1
+ Revision: 100338
- Add the language support files for version 3.00
- Don't obsoletes the lang sub-packages
- Make the lang sub-packages noarch
- User %%configure2_5x

* Fri Apr 29 2011 dmorgan <dmorgan> 3.00-2.mga1
+ Revision: 93442
- Obsolete old packages

* Thu Apr 14 2011 tv <tv> 3.00-1.mga1
+ Revision: 85208
- new release
- sync with mdv:
  o Fix requires in the devel package
  o make it build
  o Fix file list
  o Do not package .la/.a files
  o use configure
  o Remove deprecated patches

* Fri Jan 21 2011 ahmad <ahmad> 2.04-6.mga1
+ Revision: 29545
- imported package tesseract