Need a script for small change



kuber123
New User

Sep 14, 2013, 1:14 AM


Views: 4426
Need a script for small change

<title>References:</title>
<ref id="ref0152"><mixed-citation publication-type="journal"><string-name><surname>Adermark</surname>, <given-names>L.</given-names></string-name>, <string-name><surname>Talani</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Lovinger</surname>, <given-names>D.M.</given-names></string-name>, <year>2009</year>. <article-title>Endocannabinoid-dependent plasticity at GABAergic and glutamatergic synapses in the striatum is regulated by synaptic activity</article-title>. <source>Eur J Neurosci.</source> <volume>29</volume><bold>, </bold><fpage>32</fpage>-<lpage>41</lpage>.</mixed-citation></ref>
<ref id="ref0153"><mixed-citation publication-type="journal"><string-name><surname>Allan</surname>, <given-names>A.M.</given-names></string-name>, <string-name><surname>Liang</surname>, <given-names>X.</given-names></string-name>, <string-name><surname>Luo</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Pak</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Li</surname>, <given-names>X.</given-names></string-name>, <string-name><surname>Szulwach</surname>, <given-names>K.E.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Jin</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Zhao</surname>, <given-names>X.</given-names></string-name>, <year>2008</year>. <article-title>The loss of methyl-CpG binding protein 1 leads to autism-like behavioral deficits</article-title>. <source>Hum Mol Genet.</source> <volume>17</volume><bold>,</bold> <fpage>2047</fpage>-<lpage>57</lpage>.</mixed-citation></ref>
<ref id="ref0154"><mixed-citation publication-type="other"><collab>American Psychiatric Association</collab>, <source>Diagnostic and statistical manual of mental disorders: DSM-IV-TR</source>, <year>2000</year>.</mixed-citation></ref>

I need a script to change the first part of each reference line, as follows.

<title>References:</title>
1. <ref id="ref0152"> Need to change as <ref id="R1>
2. <ref id="ref0153"> Need to change as <ref id="R2>

Under the References heading, any number can appear in the ref id; we need to change the ids to a sequence (R1, R2, and so on).

(I have attached Input XML)


Laurent_R
Veteran / Moderator

Sep 14, 2013, 1:59 AM


Views: 4422
Re: [kuber123] Need a script for small change

What have you tried so far? Did you start to code something?

Assuming your XML line is stored in the $line variable, the substitution itself is easy:

Code
$line =~ s/<ref id="ref0152">/<ref id="R1>/g;


Although I guess you probably really want the new string to be <ref id="R1"> rather than <ref id="R1> as you wrote, in which case you just have to add the missing double quote after the R1 in the code line above.

I leave it to you to build the other substitution; it should be fairly obvious.

Please show your code if you need further help.


kuber123
New User

Sep 14, 2013, 5:49 AM


Views: 4421
Re: [Laurent_R] Need a script for small change

Laurent,

Your script is applicable for the below number only. My requirement is different.

$line =~ s/<ref id="ref0152">/<ref id="R1">/g;

ref0152 is variable; it can start at any number. Wherever <title>References</title> appears, the ref id sequence below it should be renumbered starting at R1.

One more thing: References will appear four or five times, so the sequence should restart at 1 in every section, as described above.

Here is what I have tried:

$xmlcont =~ s#(<ref) (id=\")ref00(\d)(\">)#\1\2F\#\4\3#gi;


(This post was edited by kuber123 on Sep 14, 2013, 6:08 AM)


BillKSmith
Veteran

Sep 14, 2013, 7:27 AM


Views: 4325
Re: [kuber123] Need a script for small change

If we can rely on 'ref id' being at the start of a source line, as in your example, we can use a one-liner:

Code
perl -pe's/(?<=^<ref id=")ref\d{1,4}/"R".++$i/e' source.html


Note the Unix quoting. On Windows, it would probably be easier to put the code in a file.
Good Luck,
Bill
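
As written, the one-liner keeps a single counter for the whole file, whereas the request is for numbering that restarts under every References title. A minimal sketch of a file-based variant follows, assuming each <ref> starts its own line (as in the one-liner) and that every section opens with a <title>References</title> line in any letter case, with or without a trailing colon. The name renumber_refs is a hypothetical helper, not something from this thread.

```perl
use strict;
use warnings;

# renumber_refs is a hypothetical helper name, not from the thread.
# It rewrites ref ids as R1, R2, ... and restarts the count at every
# References title line.
sub renumber_refs {
    my @lines = @_;    # work on a copy so the caller's data is untouched
    my $count = 0;
    for my $line (@lines) {
        # A new References heading starts a new sequence.
        $count = 0 if $line =~ m{<title>\s*References:?\s*</title>}i;
        # Replace the id at the start of the line, as in the one-liner.
        $line =~ s/(?<=^<ref id=")ref\d+/"R" . ++$count/e;
    }
    return @lines;
}

print renumber_refs(
    qq{<title>REFERENCES</title>\n},
    qq{<ref id="ref0023"><mixed-citation>...</mixed-citation></ref>\n},
    qq{<ref id="ref0024"><mixed-citation>...</mixed-citation></ref>\n},
    qq{<title>References</title>\n},
    qq{<ref id="ref0051"><mixed-citation>...</mixed-citation></ref>\n},
);
```

With the sample lines above, the three ids become R1, R2 and then R1 again after the second title. To process a real file, the same function could be fed the lines read from a filehandle.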


Laurent_R
Veteran / Moderator

Sep 14, 2013, 12:02 PM


Views: 4319
Re: [kuber123] Need a script for small change


In Reply To
Laurent,
Your script is applicable for the below number only.


Sorry, I followed your description of the requirement. Perhaps I should have guessed that you wanted something else, but you did not say it and you did not say what.

Even with your new message, I am still not sure what you really want. Do you just want a reference counter that is incremented each time?


kuber123
New User

Sep 16, 2013, 1:02 AM


Views: 4278
Re: [Laurent_R] Need a script for small change

Laurent,

My exact requirement is as follows:

Input:
<title>REFERENCES</title>
<ref id="ref0023"><mixed-citation publication-type="journal"><label>1.</label> <string-name><surname>Bell</surname> <given-names>BG</given-names></string-name>, <string-name><surname>Kircher</surname> <given-names>JC</given-names></string-name>, <string-name><surname>Bernhardt</surname> <given-names>PC</given-names></string-name>. <article-title>New measures improve the accuracy of the directed-lie test when detecting deception using a mock crime</article-title>. <source><italic>Physiol Behav</italic></source> <year>2008</year>; <volume><bold>94</bold></volume>: <fpage>331</fpage>–<lpage>340</lpage>.</mixed-citation></ref>
<ref id="ref0024"><mixed-citation publication-type="journal"><label>2.</label> <string-name><surname>Lykken</surname> <given-names>DT</given-names></string-name>. <article-title>Psychology and the lie detector industry</article-title>. <source><italic>Am Psychol</italic></source> <year>1974</year>; <volume><bold>29</bold></volume>: <fpage>725</fpage>–<lpage>739</lpage>.</mixed-citation></ref>

<p><bold>Development of the CAINS</bold></p>
<kwd-group><kwd>ACL</kwd><kwd>Injury</kwd><kwd>Reconstruction</kwd><kwd>Muscle</kwd><kwd>EMG</kwd></kwd-group>

<title>References</title>
<ref id="ref0051"><mixed-citation publication-type="journal"><label>1.</label> <string-name><surname>Kirkpatrick</surname> <given-names>B</given-names></string-name>, <string-name><surname>Fenton</surname> <given-names>WS</given-names></string-name>, <string-name><surname>Carpenter</surname> <given-names>WT</given-names> <suffix>Jr</suffix></string-name>, <article-title>Marder SR: The NIMH-MATRICS consensus statement on negative symptoms</article-title>. <source>Schizophr Bull</source> <year>2006</year>; <volume>32</volume>:<fpage>214</fpage>–<lpage>219</lpage></mixed-citation></ref>
<ref id="ref0052"><mixed-citation publication-type="journal"><label>2.</label> <string-name><surname>Blanchard</surname> <given-names>JJ</given-names></string-name>, <string-name><surname>Kring</surname> <given-names>AM</given-names></string-name>, <string-name><surname>Horan</surname> <given-names>WP</given-names></string-name>, <article-title>Gur RE: Toward the next generation of negative symptom assessments: the collaboration to advance negative symptom assessment in schizophrenia</article-title>. <source>Schizophr Bull</source> <year>2011</year>; <volume>37</volume>:<fpage>291</fpage>–<lpage>299</lpage> <comment>10.1093/schbul/sbq104</comment></mixed-citation></ref>




Output:
<title>REFERENCES</title>
<ref id="R1"><no>1.</no><cit type="journal"> <au><lname>Bell</lname> <fname>BG</fname></au>, <au><lname>Kircher</lname> <fname>JC</fname></au>, <au><lname>Bernhardt</lname> <fname>PC</fname></au>. <atitle>New measures improve the accuracy of the directed-lie test when detecting deception using a mock crime</atitle>. <source><I>Physiol Behav</I></source> <yr>2008</yr>; <vol><B>94</B></vol>: <fpage>331</fpage>–<lpage>340</lpage>.</cit></ref>
<ref id="R2"><no>2.</no><cit type="journal"> <au><lname>Lykken</lname> <fname>DT</fname></au>. <atitle>Psychology and the lie detector industry</atitle>. <source><I>Am Psychol</I></source> <yr>1974</yr>; <vol><B>29</B></vol>: <fpage>725</fpage>–<lpage>739</lpage>.</cit></ref>

<p><bold>Development of the CAINS</bold></p>
<kwd-group><kwd>ACL</kwd><kwd>Injury</kwd><kwd>Reconstruction</kwd><kwd>Muscle</kwd><kwd>EMG</kwd></kwd-group>

<title>References</title>
<ref id="R1"><cit type="journal"> <au><lname>Kirkpatrick</lname> <fname>B</fname></au>, <au><lname>Fenton</lname> <fname>WS</fname></au>, <au><lname>Carpenter</lname> <fname>WT</fname> <sfx>Jr</sfx></au>, <atitle>Marder SR: The NIMH-MATRICS consensus statement on negative symptoms</atitle>. <source>Schizophr Bull</source> <yr>2006</yr>; <vol>32</vol>:<fpage>214</fpage>–<lpage>219</lpage></cit></ref>
<ref id="R2"><cit type="journal"> <au><lname>Blanchard</lname> <fname>JJ</fname></au>, <au><lname>Kring</lname> <fname>AM</fname></au>, <au><lname>Horan</lname> <fname>WP</fname></au>, <atitle>Gur RE: Toward the next generation of negative symptom assessments: the collaboration to advance negative symptom assessment in schizophrenia</atitle>. <source>Schizophr Bull</source> <yr>2011</yr>; <vol>37</vol>:<fpage>291</fpage>–<lpage>299</lpage> <comment>10.1093/schbul/sbq104</comment></cit></ref>


Script Used:


use strict;
use warnings;

my $input = $ARGV[0];

# Read the names of all .xml files in the input directory
opendir(my $dirc, $input) || die "cannot open the directory $input: $!";
my @chap_files = readdir($dirc);
closedir($dirc);
my @files = grep { /\.xml$/i } @chap_files;

foreach my $file_one (@files) {
    open(my $chap, '<', $input . "\\" . $file_one) || die "cannot open the file $file_one: $!";
    open(my $xmlfile, '>', $input . "\\output\\" . $file_one . "_out") || die "can't open xml file: $!";
    my $xmlcont_abs = '';
    while (<$chap>) {
        my $xmlcont = $_;
        # Tag renaming
        # Move the label in front of a book citation tag
        $xmlcont =~ s#(<mixed-citation publication-type="book">)(<label>\d+\.?</label>)#$2$1#gi;
        $xmlcont =~ s#</label>#</no>#gi;
        $xmlcont =~ s#<label>#<no>#gi;
        $xmlcont =~ s#<string-name>#<au>#gi;
        $xmlcont =~ s#</string-name>#</au>#gi;
        $xmlcont =~ s#<surname>#<lname>#gi;
        $xmlcont =~ s#</surname>#</lname>#gi;
        $xmlcont =~ s#<given-names>#<fname>#gi;
        $xmlcont =~ s#</given-names>#</fname>#gi;
        # Rename both the opening and the closing tag
        $xmlcont =~ s#<(/?)suffix>#<${1}sfx>#gi;
        $xmlcont =~ s#<(/?)article-title>#<${1}atitle>#gi;

        $xmlcont_abs .= $xmlcont;
    }
    close($chap);
    print $xmlfile $xmlcont_abs;
    close($xmlfile);
}


Request for Modification:
1. <ref id="ref00XX"> should change as <ref id="R1">
2. <mixed-citation publication-type="journal"><label>1.</label> needs to change to <no>1.</no><cit type="journal"> (the two elements must be transposed).


Laurent_R
Veteran / Moderator

Sep 16, 2013, 10:55 AM


Views: 4266
Re: [kuber123] Need a script for small change

Still not very clear...


Quote
<ref id="ref00XX"> should change as <ref id="R1">


Maybe you want this:


Code
$line =~ s/<ref id="ref00\d\d">/<ref id="R1">/gi;


I still do not understand the second requirement. If you just want to replace one string by another, just do it. If there is more to it, then what?
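
One possible reading of the second requirement, judging from the Input and Output samples, is that the label has to move in front of the citation tag while both tags are renamed. A minimal sketch under that assumption follows. The name convert_citation is a hypothetical helper, and the handling of citations that have no label is a guess, not something stated in the thread.

```perl
use strict;
use warnings;

# convert_citation is a hypothetical helper name, not from the thread.
sub convert_citation {
    my ($line) = @_;
    # Move the label in front of the citation tag and rename both:
    # <mixed-citation publication-type="X"><label>N.</label>
    #   becomes <no>N.</no><cit type="X">
    $line =~ s{<mixed-citation publication-type="([^"]*)"><label>([^<]*)</label>}
              {<no>$2</no><cit type="$1">}g;
    # Assumption: citations without a label only need the tag renamed.
    $line =~ s{<mixed-citation publication-type="([^"]*)">}{<cit type="$1">}g;
    $line =~ s{</mixed-citation>}{</cit>}g;
    return $line;
}

print convert_citation(
    qq{<ref id="R1"><mixed-citation publication-type="journal"><label>1.</label> <source>Physiol Behav</source></mixed-citation></ref>\n}
);
```

Capturing the publication type and the label text, rather than hard-coding "journal" and "1.", lets the same substitution cover every reference in the list.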