Help in formatting a Perl script for creating a concordance of Perso-Arabic

 



ardibehest
New User

Aug 8, 2015, 6:22 PM

Post #1 of 1 (1381 views)

I have written a Perl script to identify syllables in Sindhi written in the Perso-Arabic script. I need this to eventually train a converter from Sindhi in Perso-Arabic script to Sindhi in Devanagari script.
DETAILS
The script reads two files:
1. Syllables: a list of all the syllables.
2. Corpus: a list of words in Arabic script followed by their Devanagari equivalent, delimited by =
e.g.

Code
अंग=اَنگ 
अंगणु=اَنگَڻُ
अंगनि=اَنگن
अंगल=اَنگَلَ
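
One possible way to restrict the analysis to the Perso-Arabic half of each corpus line is sketched below. It picks the half by Unicode script rather than by column position, so it should work whichever side of the = the Arabic text sits on. The name arabic_side is hypothetical, and the sample lines are the ones shown above; the attached script may read its data differently.

```perl
use strict;
use warnings;
use utf8;                                  # this source contains UTF-8 literals
binmode(STDOUT, ':encoding(UTF-8)');

# Return the Perso-Arabic half of a "left=right" corpus line.
# Returns nothing if the line has no = or no Arabic-script half.
# The half is chosen by Unicode script, not by position, so the
# column order in the corpus file does not matter.
sub arabic_side {
    my ($line) = @_;
    $line =~ s/\R\z//;                     # drop the line ending
    my @halves = split /=/, $line, 2;
    return unless @halves == 2;
    for my $half (@halves) {
        $half =~ s/^\s+|\s+$//g;           # trim stray spaces
        return $half if $half =~ /\p{Arabic}/;
    }
    return;
}

# Sample lines taken from the post. A real run would read the corpus
# file through a handle opened with '<:encoding(UTF-8)'.
my @corpus = ("अंग=اَنگ ", "अंगणु=اَنگَڻُ", "अंगनि=اَنگن", "अंगल=اَنگَلَ");
my @words  = map { arabic_side($_) } @corpus;
print "$_\n" for @words;
```

Only the words in @words would then be passed on to the syllable matching, so the Devanagari text never takes part in it.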

EXPECTED FORMAT
For each syllable, the output should give:
a. The syllable in question and whether it is Initial, Medial or Final.
b. At least 6 to 10 examples (at present only one is printed).
c. As a bonus, a frequency count of all the words [not present in my script: I don't know how to keep two sets of counts].
In other words, the output should look like this:
SYLLABLE: FREQUENCY
Initial: 6 EXAMPLES
Medial: 6 EXAMPLES
Final: 6 EXAMPLES
Standalone: 6 EXAMPLES
If there are no examples, or fewer than six, the output should say so.
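The four position labels above could be assigned along the following lines. This is only a sketch: positions_in is a hypothetical name, and the test is plain substring matching on the Perso-Arabic word, so it does not check that a match falls on a true syllable boundary (a bare consonant will also match inside a consonant-plus-diacritic syllable).

```perl
use strict;
use warnings;
use utf8;                                  # this source contains UTF-8 literals
binmode(STDOUT, ':encoding(UTF-8)');

# Return every position label at which $syll occurs in $word:
# 'Standalone' if the word is exactly the syllable, otherwise any of
# 'Initial', 'Medial', 'Final' (a word can yield more than one label).
sub positions_in {
    my ($word, $syll) = @_;
    return () unless length $syll;
    return ('Standalone') if $word eq $syll;
    my %seen;
    my $last_start = length($word) - length($syll);
    my $from = 0;
    while ((my $at = index($word, $syll, $from)) >= 0) {
        my $label = $at == 0           ? 'Initial'
                  : $at == $last_start ? 'Final'
                  :                      'Medial';
        $seen{$label} = 1;
        $from = $at + 1;                   # keep looking for later matches
    }
    return grep { $seen{$_} } qw(Initial Medial Final);
}

# One word from the sample corpus, tried against a few candidate syllables.
my $word = "اَنگَلَ";
for my $syll ("اَن", "گَ", "لَ") {
    my @labels = positions_in($word, $syll);
    print "$syll in $word: ", (@labels ? join(', ', @labels) : 'none'), "\n";
}
```

Both strings must already be decoded to characters (for example by reading the files with an ':encoding(UTF-8)' layer); otherwise index and length count bytes and the position test goes wrong.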
It does work to a certain extent, but the following major problems remain:

PROBLEMS
1. The script should address only the Perso-Arabic side, using the = delimiter, and ignore the Devanagari side. It does not do that, as a result of which all final occurrences are not shown. I don't know how to instruct the program to limit the analysis to the Arabic side of the corpus and ignore the rest.
2. I need at least 6 to 10 instances of tokens from the corpus file. At present only one is given.
3. If possible, the frequency should be provided [I don't know how to keep two sets of counts].
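For problems 2 and 3, one option is to keep both sets of counts and the capped example lists in a single nested hash. The sketch below is an assumption about how that might look, not the attached script: record, report, %stats and $MAX_EXAMPLES are hypothetical names, the cap of 6 is arbitrary, and the words in the demonstration are made-up ASCII placeholders rather than corpus data.

```perl
use strict;
use warnings;
binmode(STDOUT, ':encoding(UTF-8)');

my $MAX_EXAMPLES = 6;   # assumed cap; the post asks for 6 to 10

# One structure holds both sets of counts:
#   $stats{$syll}{total}               hits recorded over all positions
#   $stats{$syll}{count}{$position}    hits for one position
#   $stats{$syll}{examples}{$position} up to $MAX_EXAMPLES sample words
my %stats;

# Register one (syllable, position, word) hit.
sub record {
    my ($syll, $position, $word) = @_;
    my $entry = $stats{$syll} ||= {};
    $entry->{total}++;
    $entry->{count}{$position}++;
    my $examples = $entry->{examples}{$position} ||= [];
    push @$examples, $word if @$examples < $MAX_EXAMPLES;
}

# Build the report text for one syllable in the layout requested above.
sub report {
    my ($syll) = @_;
    my $entry = $stats{$syll} || {};
    my $out = sprintf "%s: %d\n", $syll, $entry->{total} // 0;
    for my $position (qw(Initial Medial Final Standalone)) {
        my $n     = $entry->{count}{$position} // 0;
        my @found = @{ $entry->{examples}{$position} // [] };
        $out .= sprintf "%s: %d %s\n", $position, $n,
                        @found ? join(' ', @found) : '(none found)';
    }
    return $out;
}

# Placeholder data, only to show the calls.
record('an', 'Initial', $_) for qw(anga angal);
record('an', 'Final', 'kaan');
print report('an');
```

Because the counters keep incrementing after the example list is full, the printed number shows the true frequency even when only six examples are listed, and a number below six makes it clear that fewer examples exist.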
I have racked my brains over this, and all attempts to get this type of output have failed.
I am attaching the script and the data files. Could you please help me out?
Many thanks for your help.
P.S. The preview does not display the text data, but the attachment provides the sample data.
Attachments: data n script.zip (192 KB)

 
 

