I cannot extract the text from an element using ElementTree - python-3.x

A snippet of my document and the code is as follows:
import xml.etree.ElementTree as ET
obj = ET.fromstring("""
<tab>
<infos><bounds left="7947" top="88607" width="10086" height="1184" bottom="89790" right="18032" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="true" mbFramePrintAreaValid="true"/> <prtBounds left="115" top="0" width="9300" height="1169" bottom="1168" right="9414"/> </infos>
<row > <infos> <bounds left="8062" top="88607" width="9300" height="524" bottom="89130" right="17361" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="true" mbFramePrintAreaValid="true"/> <prtBounds left="0" top="0" width="9300" height="524" bottom="523" right="9299"/> </infos>
<cell ptr="000002232E644270" id="199" symbol="class SwCellFrame" next="202" upper="198" lower="200" rowspan="1"> <infos> <bounds left="8062" top="88607" width="546" height="524" bottom="89130" right="8607" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="true" mbFramePrintAreaValid="true"/> <prtBounds left="7" top="15" width="532" height="509" bottom="523" right="538"/> </infos>
<txt> <infos> <bounds left="8069" top="88622" width="532" height="187" bottom="88808" right="8600" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="false" mbFramePrintAreaValid="true"/> <prtBounds left="0" top="3" width="532" height="184" bottom="186" right="531"/> </infos>
<Finish/>
</txt>
<txt> <infos> <bounds left="8069" top="88809" width="532" height="149" bottom="88957" right="8600" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="false" mbFramePrintAreaValid="true"/> <prtBounds left="136" top="0" width="396" height="149" bottom="148" right="531"/> </infos>
UDA <Finish/>
</txt>
</cell>
<cell ptr="000002232E642E40" id="202" symbol="class SwCellFrame" next="205" prev="199" upper="198" lower="203" rowspan="1"> <infos> <bounds left="8608" top="88607" width="3283" height="524" bottom="89130" right="11890" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="true" mbFramePrintAreaValid="true"/> <prtBounds left="7" top="15" width="3269" height="509" bottom="523" right="3275"/> </infos>
<txt>
<infos> <bounds left="8615" top="88622" width="3269" height="180" bottom="88801" right="11883" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="false" mbFramePrintAreaValid="true"/> <prtBounds left="0" top="7" width="3269" height="173" bottom="179" right="3268"/> </infos> <Finish/>
</txt>
<txt> <infos> <bounds left="8615" top="88802" width="3269" height="149" bottom="88950" right="11883" mbFixSize="false" mbFrameAreaPositionValid="true" mbFrameAreaSizeValid="false" mbFramePrintAreaValid="true"/> <prtBounds left="58" top="0" width="3170" height="149" bottom="148" right="3227"/> </infos>
Nombre <Finish/>
</txt>
</cell>
</row>
</tab>
""")
a = obj.findall('./row/cell/txt')
for i, item in enumerate(a):
    print(i, item.text.strip())
But if I simplify the document, I do manage to extract the text,
obj = ET.fromstring("""
<tab>
<row>
<cell >
<txt > <Finish/> </txt>
<txt > UDA <Finish/> </txt>
</cell>
<cell >
<txt > <Finish/> </txt>
<txt > Nombre <Finish/> </txt>
</cell>
</row>
</tab>
""")
a = obj.findall('./row/cell/txt')
for i, item in enumerate(a):
    print(i, item.text.strip())
0
1 UDA
2
3 Nombre
I don't know how to solve this problem, because my working document is very large and I can't simplify it as I have done in this example.

The "UDA" and "Nombre" strings are found in the tail of the infos elements, not in the text of the txt elements (which is why item.text contains only whitespace). The easiest way to get the wanted output is to use itertext():
a = obj.findall('./row/cell/txt')
for i, item in enumerate(a):
    text = "".join([s.strip() for s in item.itertext()])
    print(i, text)
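If you prefer plain element access instead, the same strings are reachable through the tail attribute of each txt element's infos child. A minimal sketch, assuming the same obj as above:
a = obj.findall('./row/cell/txt')
for i, item in enumerate(a):
    infos = item.find('infos')
    # "UDA" and "Nombre" are the tail text that follows </infos>
    tail = (infos.tail or "").strip() if infos is not None else ""
    print(i, tail)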

Related

Extract all the elements within a class in python

I have made the following, in order to extract the first element in a class:
if var_source == "Image":
    outcsvfile = 'Image_Ids' + file + '_' + timestamp + '.csv'
    with open(outcsvfile, 'w', encoding='utf-8', newline='') as csvfile:
        csv_writer = csv.writer(csvfile)
        csv_writer.writerow(['ax', 'physical_id'])
    for i in range(len(var_ax)):
        browser.get('https://test.com' + str(mpid) + '&ax=' + var_ax[i])
        self.master.update()
        self.status.config(text=str(i+1) + "/" + str(len(var_ax)) + " Extracting AX: " + var_ax[i])
        try:
            ph_id = browser.find_element_by_xpath("//div[contains(@class, 'a-image-wrapper')]").get_attribute("alt")
            print(i+1, ': extract AX:', var_ax[i])
            with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                csv_writer = csv.writer(csvfile)
                csv_writer.writerow([var_ax[i], ph_id])
        except:
            print(i+1, ': extract AX:', var_ax[i])
            with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                csv_writer = csv.writer(csvfile)
                csv_writer.writerow([var_ax[i], '[missing AX]'])
I have 2 questions:
How can I extract all the physical_ids in the same cell separated by a comma (cell B2 = "physical_id1, physical_id2, physical_id3")?
How can I put the count of exported physical_ids in column C (e.g. C2 would be 3, because B2 contains 3 physical_ids)?
The source code:
<div alt="51d5gBEzhjL" style="width:220px;float:left;margin-left:34px;margin-bottom:10px;border:1px solid #D0D0D0" class="a-image-wrapper a-lazy-loaded MAIN GLOBAL 51d5gBEzhjL"><h1 class="a-size-medium a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold">MAIN</h1><h1 class="a-size-base a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold"> ou GLOBAL / Merch 1</h1></div>
<h1 class="a-size-medium a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold">FACT</h1>
<h1 class="a-size-base a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold"> ou GLOBAL / Merch 1</h1>
<span class="a-declarative" data-action="a-modal"><center><img class="ecx" id="51S+wTs36zL" src="https://test.com/images/I/51S+wTs36zL._AA200_.jpg" alt="51S+wTs36zL"></center></span>
<center>
<img class="ecx" id="51S+wTs36zL" src="https://test.com/images/I/51S+wTs36zL._AA200_.jpg" alt="51S+wTs36zL">
</center>
</span>
<h5 class="physical-id">51S+wTs36zL</h5>
<h1 class="a-size-medium a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold" style="background:#D0D0D0">UPLOADED</h1>
<h1 class="a-size-base a-spacing-mini a-spacing-top-mini a-color-information a-text-center a-text-bold">19/Apr/2016:17:45:40</h1>
</div>
This worked for me and resolved both my questions:
if var_source == "Image":
    outcsvfile = 'Image_Ids-' + file + '_' + timestamp + '.csv'
    with open(outcsvfile, 'w', encoding='utf-8', newline='') as csvfile:
        csv_writer = csv.writer(csvfile)
        csv_writer.writerow(['ax', 'physical_id', 'image_count'])
    for i in range(len(var_ax)):
        browser.get('https://test.com' + str(mpid) + '&ax=' + var_ax[i])
        self.master.update()
        self.status.config(text=str(i+1) + "/" + str(len(var_ax)) + " Extracting AX: " + var_ax[i])
        try:
            ph_id = browser.find_element_by_xpath("//div[contains(@class, 'a-image-wrapper')]").get_attribute("alt")
            ids1 = browser.find_elements_by_class_name("physical-id")
            ids1Text = []
            for a in ids1:
                ids1Text.append(a.text)
            nr = str(len(ids1))
            ax = ', '.join(ids1Text)
            print(i+1, ': extract AX:', var_ax[i])
            with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                csv_writer = csv.writer(csvfile)
                csv_writer.writerow([var_ax[i], ax, nr])
        except:
            print(i+1, ': extract AX:', var_ax[i])
            with open(outcsvfile, 'a+', encoding='utf-8', newline='') as csvfile:
                csv_writer = csv.writer(csvfile)
                csv_writer.writerow([var_ax[i], '[missing AX]'])
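The core of both fixes is independent of Selenium: collect the texts into a list, then ', '.join(...) gives the comma-separated cell value and len(...) gives the count. A minimal standalone sketch (the IDs below are made up):
ids1Text = ['51S+wTs36zL', '41aaaaaaaaL', '61bbbbbbbbL']  # hypothetical physical ids
cell_b = ', '.join(ids1Text)  # "51S+wTs36zL, 41aaaaaaaaL, 61bbbbbbbbL"
cell_c = len(ids1Text)        # 3
print(cell_b, cell_c)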

Extract tag content from XML using beautifulsoup

Thanks to the many threads I found here, I managed to do some of what I wanted to do. But now I'm stuck. Help would be appreciated.
So I have this XML file of a few thousand records, from which I want to extract
The contents of tag 520 (URL)
The contents of tag 001 (recno) wherever a tag 520 was found
--> So the result should be a list of URLs + recnos.
Bonus points for helping me export the subsequent result to a csv instead of showing it onscreen ;)
# Import BeautifulSoup
from bs4 import BeautifulSoup as bs
content = []
# Read the XML file
with open("snippet_bilzen.xml", "r") as file:
    # Read each line in the file, readlines() returns a list of lines
    content = file.readlines()
    # Combine the lines in the list into a string
    content = "".join(content)
    bs_content = bs(content, "lxml")
# Get contents of tag 520
rows_url = bs_content.find_all(tag="520")
for row in rows_url:  # Print all occurrences
    print(row.get_text())
# Trying to get contents of tag 001 where 520 occurs
rows_id = bs_content.find_all(tag="001")
for row in rows_id:
    print(row.get_text())
This is a piece of the XML:
<record>
<leader>00983nam a2200000 c 4500</leader>
<controlfield tag="001">c:obg:160033</controlfield>
<controlfield tag="005">20180605143926.1</controlfield>
<controlfield tag="008">060214s1987 xx u und </controlfield>
<datafield ind1="3" ind2=" " tag="024">
<subfield code="a">0075992557726</subfield>
</datafield>
<datafield ind1="1" ind2="0" tag="245">
<subfield code="a">Sign 'O' the times</subfield>
</datafield>
<datafield ind1="#" ind2="#" tag="260">
<subfield code="b">Paisley Park</subfield>
<subfield code="c">1987</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="300">
<subfield code="a">2 cd's</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="306">
<subfield code="a">01:19:51</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="340">
<subfield code="a">cd</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="500">
<subfield code="a">Met teksten</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="520">
<subfield code="a">ill</subfield>
<subfield code="u">http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.95.131.jpg.rm99991231.51210.17208</subfield>
</datafield>
</record>
<record>
<leader>00854nam a2200000 c 4500</leader>
<controlfield tag="001">c:obg:157417</controlfield>
<controlfield tag="005">20180725100810.1</controlfield>
<controlfield tag="008">060214s1984 xx u und </controlfield>
<datafield ind1="3" ind2=" " tag="024">
<subfield code="a">0042282289827</subfield>
</datafield>
<datafield ind1="3" ind2=" " tag="024">
<subfield code="a">4007196101944</subfield>
</datafield>
<datafield ind1="2" ind2=" " tag="024">
<subfield code="a">JKX0823</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="028">
<subfield code="a">IMCD 236/822 898-2</subfield>
</datafield>
<datafield ind1="1" ind2="3" tag="245">
<subfield code="a">The unforgettable fire</subfield>
</datafield>
<datafield ind1="#" ind2="#" tag="260">
<subfield code="b">Island Records</subfield>
<subfield code="c">1984</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="300">
<subfield code="a">1 cd</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="306">
<subfield code="a">00:42:48</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="340">
<subfield code="a">cd</subfield>
</datafield>
<datafield ind1=" " ind2=" " tag="520">
<subfield code="a">ill</subfield>
<subfield code="u">http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.31.88.jpg.rm99991231.19959.13742</subfield>
</datafield>
</record>
Try this.
from simplified_scrapy import SimplifiedDoc,req,utils
html = utils.getFileContent('snippet_bilzen.xml')
doc = SimplifiedDoc(html)
rows_url = doc.selects('#tag=520').select('#code=u').text
rows_id = doc.selects('#tag=001').text
print (rows_url)
print (rows_id)
Result:
['http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.95.131.jpg.rm99991231.51210.17208', 'http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.31.88.jpg.rm99991231.19959.13742']
['c:obg:160033', 'c:obg:157417']
If I understand you right, you want to get data only from records where elements with tag="520" and tag="001" are present:
from bs4 import BeautifulSoup

with open('snippet_bilzen.xml', 'r') as f_in:
    soup = BeautifulSoup(f_in.read(), 'html.parser')

data = []
for record in soup.select('record:has([tag="520"] > [code="u"]):has([tag="001"])'):
    tag_520 = record.select_one('[tag="520"] > [code="u"]')  # select URL
    tag_001 = record.select_one('[tag="001"]')               # select tag="001"
    data.append([tag_520.get_text(strip=True), tag_001.get_text(strip=True)])
print(data)
Prints:
[['http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.95.131.jpg.rm99991231.51210.17208', 'c:obg:160033'],
['http://geapbib001.cipal.be/docman/docman.phtml?file=authorities.87.31.88.jpg.rm99991231.19959.13742', 'c:obg:157417']]
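For the bonus question (writing the result to a CSV instead of printing it), the data list built above can be handed straight to the csv module. A minimal sketch; the output filename is just an example:
import csv

with open('urls_recnos.csv', 'w', newline='', encoding='utf-8') as f_out:
    writer = csv.writer(f_out)
    writer.writerow(['url', 'recno'])  # header row
    writer.writerows(data)             # one [url, recno] row per record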

BeautifulSoup - find_all div tags with different class name

I want to select all <div> elements whose class is either post has-profile bg2 or post has-profile bg1, but not the last one, i.e. panel.
<div id="6" class="post has-profile bg2"> some text 1 </div>
<div id="7" class="post has-profile bg1"> some text 2 </div>
<div id="8" class="post has-profile bg2"> some text 3 </div>
<div id="9" class="post has-profile bg1"> some text 4 </div>
<div class="panel bg1" id="abc"> ... </div>
select() is matching only a single occurrence. I'm trying it with find_all(), but bs4 is not able to find it.
if soup.find(class_ = re.compile(r"post has-profile [bg1|bg2]")):
    posts = soup.find_all(class_ = re.compile(r"post has-profile [bg1|bg2]"))
How can I solve it, both with regex and without regex? Thanks.
You can use a built-in CSS selector within BeautifulSoup:
data = """<div id="6" class="post has-profile bg2"> some text 1 </div>
<div id="7" class="post has-profile bg1"> some text 2 </div>
<div id="8" class="post has-profile bg2"> some text 3 </div>
<div id="9" class="post has-profile bg1"> some text 4 </div>
<div class="panel bg1" id="abc"> ... </div>"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'lxml')
divs = soup.select('div.post.has-profile.bg2, div.post.has-profile.bg1')
for div in divs:
    print(div)
    print('-' * 80)
Prints:
<div class="post has-profile bg2" id="6"> some text 1 </div>
--------------------------------------------------------------------------------
<div class="post has-profile bg2" id="8"> some text 3 </div>
--------------------------------------------------------------------------------
<div class="post has-profile bg1" id="7"> some text 2 </div>
--------------------------------------------------------------------------------
<div class="post has-profile bg1" id="9"> some text 4 </div>
--------------------------------------------------------------------------------
The 'div.post.has-profile.bg2, div.post.has-profile.bg1' selector selects all <div> tags with class "post has-profile bg2" and all <div> tags with class "post has-profile bg1".
You can define a function that describes the tags of interest:
def test_tag(tag):
    return tag.name=='div' \
        and tag.has_attr('class') \
        and "post" in tag['class'] \
        and "has-profile" in tag['class'] \
        and ("bg1" in tag['class'] or "bg2" in tag['class']) \
        and "panel" not in tag['class']
And apply the function to the "soup":
soup.findAll(test_tag)
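For a quick check, here is a sketch that applies the function to the sample markup, reusing the data string defined in the first answer:
from bs4 import BeautifulSoup

soup = BeautifulSoup(data, 'html.parser')  # `data` is the sample HTML string from the first answer
for tag in soup.find_all(test_tag):
    print(tag['id'], tag.get_text(strip=True))
# prints ids 6, 7, 8 and 9; the "panel bg1" div is filtered out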
Using Regex.
Try:
from bs4 import BeautifulSoup
import re
s = """<div id="6" class="post has-profile bg2"> some text 1 </div>
<div id="7" class="post has-profile bg1"> some text 2 </div>
<div id="8" class="post has-profile bg2"> some text 3 </div>
<div id="9" class="post has-profile bg1"> some text 4 </div>
<div class="panel bg1" id="abc"> ... </div>"""
soup = BeautifulSoup(s, "html.parser")
for i in soup.find_all("div", class_=re.compile(r"post has-profile bg(1|2)")):
    print(i)
Output:
<div class="post has-profile bg2" id="6"> some text 1 </div>
<div class="post has-profile bg1" id="7"> some text 2 </div>
<div class="post has-profile bg2" id="8"> some text 3 </div>
<div class="post has-profile bg1" id="9"> some text 4 </div>

Rotate SVG without misplacing the image

I have this SVG code:
<svg xmlns="http://www.w3.org/2000/svg" width="20" height="34">
<path transform="rotate(0 10 17)" fill="#6699FF" d="M 0.06,25.03 C 0.06,25.03 4.06,20.97 4.06,20.97 4.06,20.97 3.97,5.06 3.97,5.06 3.97,5.06 9.94,0.03 9.94,0.03 9.94,0.03 15.78,5.09 15.78,5.09 15.78,5.09 16.03,21.06 16.03,21.06 16.03,21.06 19.94,25.00 19.94,25.00 19.94,25.00 19.97,30.09 19.97,30.09 19.97,30.09 10.06,20.97 10.06,20.97 10.06,20.97 10.05,23.00 10.05,23.00 10.05,23.00 19.91,32.03 19.91,32.03 19.91,32.03 19.94,33.97 19.94,33.97 19.94,33.97 17.97,33.97 17.97,33.97 17.97,33.97 10.03,26.94 10.03,26.94 10.03,26.94 10.05,29.01 10.05,29.01 10.05,29.01 15.94,34.00 15.94,34.00 15.94,34.00 12.00,33.94 12.00,33.94 12.00,33.94 10.00,32.00 10.00,32.00 10.00,32.00 8.00,34.03 8.00,34.03 8.00,34.03 4.00,34.00 4.00,34.00 4.00,34.00 9.97,29.01 9.97,29.01 9.97,29.01 9.97,26.94 9.97,26.94 9.97,26.94 2.00,33.96 2.00,33.96 2.00,33.96 0.00,33.98 0.00,33.98 0.00,33.98 0.02,32.00 0.02,32.00 0.02,32.00 9.96,23.00 9.96,23.00 9.96,23.00 9.95,20.98 9.95,20.98 9.95,20.98 0.00,30.00 0.00,30.00 0.00,30.00 0.06,25.03 0.06,25.03 Z" />
<path transform="rotate(0 10 17) translate(0 0)" fill="white" d="M 8.04,17.04 C 8.04,17.04 7.98,11.04 7.98,11.04 7.98,11.04 5.00,11.00 5.00,11.00 5.00,11.00 10.00,6.04 10.00,6.04 10.00,6.04 14.96,11.02 14.96,11.02 14.96,11.02 12.02,11.04 12.02,11.04 12.02,11.04 12.02,17.04 12.02,17.04 12.02,17.04 8.04,17.04 8.04,17.04 Z" />
</svg>
It's a vessel with an arrow inside.
I am trying to rotate this "picture" (I need to be able to rotate it through a full 360 degrees).
When I change rotate(0 10 17) to rotate(90 10 17) it gets cut off. That's because I don't rotate it from the center of the image.
I tried using this formula to calculate the center, but I couldn't manage to do it myself:
x = 34; y = 20; o = 4.71238898; // angle converted from degrees to radians
a = Math.abs(x * Math.sin(o)) + Math.abs(y * Math.cos(o));
b = Math.abs(x * Math.cos(o)) + Math.abs(y * Math.sin(o));
I am really bad at these math/SVG problems; I am hoping someone can assist me.
Thanks
The centre of rotation is correct. But the trouble is that, now that it is rotated, your graphic isn't 20x34 any longer; it is 34x20.
So the first thing you have to do is update the width and height.
<svg xmlns="http://www.w3.org/2000/svg" width="34" height="20">
That's not the final solution of course, because the centre of this new 34x20 SVG is in a different place to the centre of the old 20x34 one. One solution would be to work out a different centre of rotation that would work so that the graphic rotated around into the right position in the new rectangle.
That's a bit tricky. Luckily there is a much simpler way. We can just add a viewBox to the SVG to tell the browser the dimensions of the area that the rotated symbol occupies. The browser will reposition it for us.
The four values in a viewBox attribute are:
<leftX> <topY> <width> <height>
We already know the width and height (34 and 20), so we just need to work out the left and top coords. Those are obviously just the centre-of-rotation less half the width and height respectively.
leftX = 10 - (newWidth / 2)
= 10 - 17
= -7
topY = 17 - (newHeight / 2)
= 17 - 10
= 7
So the viewBox attribute needs to be "-7 7 34 20".
<svg xmlns="http://www.w3.org/2000/svg" width="34" height="20" viewBox="-7 7 34 20">
<path transform="rotate(90 10 17)" fill="#6699FF" d="M 0.06,25.03 C 0.06,25.03 4.06,20.97 4.06,20.97 4.06,20.97 3.97,5.06 3.97,5.06 3.97,5.06 9.94,0.03 9.94,0.03 9.94,0.03 15.78,5.09 15.78,5.09 15.78,5.09 16.03,21.06 16.03,21.06 16.03,21.06 19.94,25.00 19.94,25.00 19.94,25.00 19.97,30.09 19.97,30.09 19.97,30.09 10.06,20.97 10.06,20.97 10.06,20.97 10.05,23.00 10.05,23.00 10.05,23.00 19.91,32.03 19.91,32.03 19.91,32.03 19.94,33.97 19.94,33.97 19.94,33.97 17.97,33.97 17.97,33.97 17.97,33.97 10.03,26.94 10.03,26.94 10.03,26.94 10.05,29.01 10.05,29.01 10.05,29.01 15.94,34.00 15.94,34.00 15.94,34.00 12.00,33.94 12.00,33.94 12.00,33.94 10.00,32.00 10.00,32.00 10.00,32.00 8.00,34.03 8.00,34.03 8.00,34.03 4.00,34.00 4.00,34.00 4.00,34.00 9.97,29.01 9.97,29.01 9.97,29.01 9.97,26.94 9.97,26.94 9.97,26.94 2.00,33.96 2.00,33.96 2.00,33.96 0.00,33.98 0.00,33.98 0.00,33.98 0.02,32.00 0.02,32.00 0.02,32.00 9.96,23.00 9.96,23.00 9.96,23.00 9.95,20.98 9.95,20.98 9.95,20.98 0.00,30.00 0.00,30.00 0.00,30.00 0.06,25.03 0.06,25.03 Z" />
<path transform="rotate(90 10 17) translate(0 0)" fill="white" d="M 8.04,17.04 C 8.04,17.04 7.98,11.04 7.98,11.04 7.98,11.04 5.00,11.00 5.00,11.00 5.00,11.00 10.00,6.04 10.00,6.04 10.00,6.04 14.96,11.02 14.96,11.02 14.96,11.02 12.02,11.04 12.02,11.04 12.02,11.04 12.02,17.04 12.02,17.04 12.02,17.04 8.04,17.04 8.04,17.04 Z" />
</svg>
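The same arithmetic, written out as a small Python helper in case you need it for other sizes or centres of rotation (the function name is just illustrative):
def view_box(cx, cy, new_width, new_height):
    # top-left corner = centre of rotation minus half the rotated size
    left_x = cx - new_width / 2
    top_y = cy - new_height / 2
    return "%g %g %g %g" % (left_x, top_y, new_width, new_height)

print(view_box(10, 17, 34, 20))  # -> "-7 7 34 20"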
Update
If you need to do arbitrary angles, then there is a better method, using Javascript.
Apply the transform to the paths
Call getBBox() on the SVG to get the dimensions of its content.
Use those values to update the viewBox and the width and height attributes.
var angle = 145; // degrees
var mysvg = document.getElementById("mysvg");
var paths = mysvg.getElementsByTagName("path");
// Apply a transform attribute to each path
for (var i=0; i<paths.length; i++) {
    paths[i].setAttribute("transform", "rotate("+angle+",10,17)");
}
// Now that the paths have been rotated, get the bounding box
// of the SVG contents
var bbox = mysvg.getBBox();
// Update the viewBox from the bounds
mysvg.setAttribute("viewBox", bbox.x + " " + bbox.y + " " +
                   bbox.width + " " + bbox.height);
// Update the width and height
mysvg.setAttribute("width", bbox.width + "px");
mysvg.setAttribute("height", bbox.height + "px");
<svg id="mysvg" xmlns="http://www.w3.org/2000/svg" width="20" height="34">
<path fill="#6699FF" d="M 0.06,25.03 C 0.06,25.03 4.06,20.97 4.06,20.97 4.06,20.97 3.97,5.06 3.97,5.06 3.97,5.06 9.94,0.03 9.94,0.03 9.94,0.03 15.78,5.09 15.78,5.09 15.78,5.09 16.03,21.06 16.03,21.06 16.03,21.06 19.94,25.00 19.94,25.00 19.94,25.00 19.97,30.09 19.97,30.09 19.97,30.09 10.06,20.97 10.06,20.97 10.06,20.97 10.05,23.00 10.05,23.00 10.05,23.00 19.91,32.03 19.91,32.03 19.91,32.03 19.94,33.97 19.94,33.97 19.94,33.97 17.97,33.97 17.97,33.97 17.97,33.97 10.03,26.94 10.03,26.94 10.03,26.94 10.05,29.01 10.05,29.01 10.05,29.01 15.94,34.00 15.94,34.00 15.94,34.00 12.00,33.94 12.00,33.94 12.00,33.94 10.00,32.00 10.00,32.00 10.00,32.00 8.00,34.03 8.00,34.03 8.00,34.03 4.00,34.00 4.00,34.00 4.00,34.00 9.97,29.01 9.97,29.01 9.97,29.01 9.97,26.94 9.97,26.94 9.97,26.94 2.00,33.96 2.00,33.96 2.00,33.96 0.00,33.98 0.00,33.98 0.00,33.98 0.02,32.00 0.02,32.00 0.02,32.00 9.96,23.00 9.96,23.00 9.96,23.00 9.95,20.98 9.95,20.98 9.95,20.98 0.00,30.00 0.00,30.00 0.00,30.00 0.06,25.03 0.06,25.03 Z" />
<path fill="white" d="M 8.04,17.04 C 8.04,17.04 7.98,11.04 7.98,11.04 7.98,11.04 5.00,11.00 5.00,11.00 5.00,11.00 10.00,6.04 10.00,6.04 10.00,6.04 14.96,11.02 14.96,11.02 14.96,11.02 12.02,11.04 12.02,11.04 12.02,11.04 12.02,17.04 12.02,17.04 12.02,17.04 8.04,17.04 8.04,17.04 Z" />
</svg>

Wordnet Find Synonyms

I am searching for a way to find all the synonyms of a particular word using WordNet. I am using JAWS.
For example:
love(v): admire, adulate, be attached to, be captivated by, be crazy about, be enamored of, be enchanted by, be fascinated with, be fond of, be in love with, canonize, care for, cherish, choose, deify, delight in, dote on, esteem, exalt, fall for, fancy, glorify, go for, gone on....
love(n):
Synonym : adulation, affection, allegiance, amity, amorousness, amour, appreciation, ardency, ardor, attachment, case*, cherishing, crush, delight, devotedness, devotion, emotion, enchantment, enjoyment, fervor, fidelity, flame, fondness, friendship, hankering, idolatry, inclination, infatuation, involvement
In a related question, user Ram pointed to some code, but that does not suffice, as it gives a vastly different output:
love, passion: any object of warm affection or devotion beloved, dear, dearest, honey,
love: a beloved person; used as terms of endearment
love, sexual love, erotic love: a deep feeling of sexual desire and attraction
love: a score of zero in tennis or squash
sexual love, lovemaking, making love, love, love life: sexual activities (often including sexual intercourse) between two people
love: have a great affection or liking for
So how do I achieve this, and is WordNet suited for what I want to do?
Sticking with just WordNet, you could try to use semantic similarity to determine if two words (synsets) are similar enough to be synonyms. Below is a quick example that came from modifying another of my answers on semantic similarity using WordNet.
It does have its problems though:
Antonyms are mixed in with synonyms
It is slow! (as it has to check all ~117k synsets)
Still, it produces more synonyms than using lemma_names alone, so I leave it here in case it might be useful (in conjunction with something else perhaps).
>>> from nltk.corpus import wordnet as wn
>>> def syn(word, lch_threshold=2.26):
        for net1 in wn.synsets(word):
            for net2 in wn.all_synsets():
                try:
                    lch = net1.lch_similarity(net2)
                except:
                    continue
                # The value to compare the LCH to was found empirically.
                # (The value is very application dependent. Experiment!)
                if lch >= lch_threshold:
                    yield (net1, net2, lch)
>>> for x in syn('love'):
        print x
Code above outputs:
(Synset('love.n.01'), Synset('feeling.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('conditioned_emotional_response.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('emotion.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('worship.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('anger.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('fear.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('fear.n.03'), 2.538973871058276)
(Synset('love.n.01'), Synset('anxiety.n.02'), 2.538973871058276)
(Synset('love.n.01'), Synset('joy.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('love.n.01'), 3.6375861597263857)
(Synset('love.n.01'), Synset('agape.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('agape.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('filial_love.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('ardor.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('amorousness.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('puppy_love.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('devotion.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('benevolence.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('beneficence.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('heartstrings.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('lovingness.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('warmheartedness.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('loyalty.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('hate.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('emotional_state.n.01'), 2.538973871058276)
(Synset('love.n.02'), Synset('content.n.05'), 2.538973871058276)
(Synset('love.n.02'), Synset('object.n.04'), 2.9444389791664407)
(Synset('love.n.02'), Synset('antipathy.n.02'), 2.538973871058276)
(Synset('love.n.02'), Synset('bugbear.n.02'), 2.538973871058276)
(Synset('love.n.02'), Synset('execration.n.03'), 2.538973871058276)
(Synset('love.n.02'), Synset('center.n.06'), 2.538973871058276)
(Synset('love.n.02'), Synset('hallucination.n.03'), 2.538973871058276)
(Synset('love.n.02'), Synset('infatuation.n.03'), 2.538973871058276)
(Synset('love.n.02'), Synset('love.n.02'), 3.6375861597263857)
(Synset('beloved.n.01'), Synset('person.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('lover.n.01'), 2.9444389791664407)
(Synset('beloved.n.01'), Synset('admirer.n.03'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('beloved.n.01'), 3.6375861597263857)
(Synset('beloved.n.01'), Synset('betrothed.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('boyfriend.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('darling.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('girlfriend.n.02'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('idolizer.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('inamorata.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('inamorato.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('kisser.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('necker.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('petter.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('romeo.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('soul_mate.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('squeeze.n.04'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('sweetheart.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('desire.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('sexual_desire.n.01'), 2.9444389791664407)
(Synset('love.n.04'), Synset('love.n.04'), 3.6375861597263857)
(Synset('love.n.04'), Synset('aphrodisia.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('anaphrodisia.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('passion.n.05'), 2.538973871058276)
(Synset('love.n.04'), Synset('sensuality.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('amorousness.n.02'), 2.538973871058276)
(Synset('love.n.04'), Synset('fetish.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('libido.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('lecherousness.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('nymphomania.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('satyriasis.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('the_hots.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('bowling_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('football_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('baseball_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('basketball_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('number.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('score.n.03'), 2.9444389791664407)
(Synset('love.n.05'), Synset('stroke.n.06'), 2.538973871058276)
(Synset('love.n.05'), Synset('birdie.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('bogey.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('deficit.n.03'), 2.538973871058276)
(Synset('love.n.05'), Synset('double-bogey.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('duck.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('eagle.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('double_eagle.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('game.n.06'), 2.538973871058276)
(Synset('love.n.05'), Synset('lead.n.07'), 2.538973871058276)
(Synset('love.n.05'), Synset('love.n.05'), 3.6375861597263857)
(Synset('love.n.05'), Synset('match.n.05'), 2.538973871058276)
(Synset('love.n.05'), Synset('par.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('bondage.n.03'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('outercourse.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('safe_sex.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('sexual_activity.n.01'), 2.9444389791664407)
(Synset('sexual_love.n.02'), Synset('conception.n.02'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('sexual_intercourse.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('pleasure.n.05'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('sexual_love.n.02'), 3.6375861597263857)
(Synset('sexual_love.n.02'), Synset('carnal_abuse.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('coupling.n.03'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('reproduction.n.05'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('foreplay.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('perversion.n.02'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('autoeroticism.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('promiscuity.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('lechery.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('homosexuality.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('bisexuality.n.02'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('heterosexuality.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('bestiality.n.02'), 2.538973871058276)
# ...
First we have to ask the questions "What is a synonym?" and "Can synonyms be queried from the surface/root word?".
In WordNet, similar words representing the same concept are grouped under a term called a synset, not at the surface-word level.
To get synonyms with the coverage of your example, you would require more than WordNet, possibly some semantic-similarity methods to extract the other words.
I can't give you a JAWS example of what I mean above, but here it is using the WordNet interface from NLTK in Python. You can see that WordNet alone is insufficient for the coverage you want.
from nltk.corpus import wordnet as wn
for ss in wn.synsets('love'): # Each synset represents a diff concept.
    print ss.definition
    print ss.lemma_names
    print
Code above outputs:
a strong positive emotion of regard and affection
['love']
any object of warm affection or devotion;
['love', 'passion']
a beloved person; used as terms of endearment
['beloved', 'dear', 'dearest', 'honey', 'love']
a deep feeling of sexual desire and attraction
['love', 'sexual_love', 'erotic_love']
a score of zero in tennis or squash
['love']
sexual activities (often including sexual intercourse) between two people
['sexual_love', 'lovemaking', 'making_love', 'love', 'love_life']
have a great affection or liking for
['love']
get pleasure from
['love', 'enjoy']
be enamored or in love with
['love']
have sexual intercourse with
['sleep_together', 'roll_in_the_hay', 'love', 'make_out', 'make_love', 'sleep_with', 'get_laid', 'have_sex', 'know', 'do_it', 'be_intimate', 'have_intercourse', 'have_it_away', 'have_it_off', 'screw', 'fuck', 'jazz', 'eff', 'hump', 'lie_with', 'bed', 'have_a_go_at_it', 'bang', 'get_it_on', 'bonk']
I have looked at the RiWordNet (RiTa) code (see the getAllSynonyms method) and found that it generates synonyms by giving you all lemmas of synsets, hyponyms, similar-tos, also-sees, and coordinate names. I didn't include coordinate names but added antonyms. Furthermore, I added synset names and a "synonym type" to my data so that I could make use of other WordNet data like the definition and/or examples. Here's my code in Python and the output:
'''Synonym generator using NLTK WordNet Interface: http://www.nltk.org/howto/wordnet.html
'ss': synset
'hyp': hyponym
'sim': similar to
'ant': antonym
'also': also see
'''
from nltk.corpus import wordnet as wn

def get_all_synsets(word, pos=None):
    for ss in wn.synsets(word):
        for lemma in ss.lemma_names():
            yield (lemma, ss.name())

def get_all_hyponyms(word, pos=None):
    for ss in wn.synsets(word, pos=pos):
        for hyp in ss.hyponyms():
            for lemma in hyp.lemma_names():
                yield (lemma, hyp.name())

def get_all_similar_tos(word, pos=None):
    for ss in wn.synsets(word):
        for sim in ss.similar_tos():
            for lemma in sim.lemma_names():
                yield (lemma, sim.name())

def get_all_antonyms(word, pos=None):
    for ss in wn.synsets(word, pos=None):
        for sslema in ss.lemmas():
            for antlemma in sslema.antonyms():
                yield (antlemma.name(), antlemma.synset().name())

def get_all_also_sees(word, pos=None):
    for ss in wn.synsets(word):
        for also in ss.also_sees():
            for lemma in also.lemma_names():
                yield (lemma, also.name())

def get_all_synonyms(word, pos=None):
    for x in get_all_synsets(word, pos):
        yield (x[0], x[1], 'ss')
    for x in get_all_hyponyms(word, pos):
        yield (x[0], x[1], 'hyp')
    for x in get_all_similar_tos(word, pos):
        yield (x[0], x[1], 'sim')
    for x in get_all_antonyms(word, pos):
        yield (x[0], x[1], 'ant')
    for x in get_all_also_sees(word, pos):
        yield (x[0], x[1], 'also')

for x in get_all_synonyms('love'):
    print x
The output for 'love' and 'brave':
Love
(u'love', u'love.n.01', 'ss')
(u'love', u'love.n.02', 'ss')
(u'passion', u'love.n.02', 'ss')
(u'beloved', u'beloved.n.01', 'ss')
(u'dear', u'beloved.n.01', 'ss')
(u'dearest', u'beloved.n.01', 'ss')
(u'honey', u'beloved.n.01', 'ss')
(u'love', u'beloved.n.01', 'ss')
(u'love', u'love.n.04', 'ss')
(u'sexual_love', u'love.n.04', 'ss')
(u'erotic_love', u'love.n.04', 'ss')
(u'love', u'love.n.05', 'ss')
(u'sexual_love', u'sexual_love.n.02', 'ss')
(u'lovemaking', u'sexual_love.n.02', 'ss')
(u'making_love', u'sexual_love.n.02', 'ss')
(u'love', u'sexual_love.n.02', 'ss')
(u'love_life', u'sexual_love.n.02', 'ss')
(u'love', u'love.v.01', 'ss')
(u'love', u'love.v.02', 'ss')
(u'enjoy', u'love.v.02', 'ss')
(u'love', u'love.v.03', 'ss')
(u'sleep_together', u'sleep_together.v.01', 'ss')
(u'roll_in_the_hay', u'sleep_together.v.01', 'ss')
(u'love', u'sleep_together.v.01', 'ss')
(u'make_out', u'sleep_together.v.01', 'ss')
(u'make_love', u'sleep_together.v.01', 'ss')
(u'sleep_with', u'sleep_together.v.01', 'ss')
(u'get_laid', u'sleep_together.v.01', 'ss')
(u'have_sex', u'sleep_together.v.01', 'ss')
(u'know', u'sleep_together.v.01', 'ss')
(u'do_it', u'sleep_together.v.01', 'ss')
(u'be_intimate', u'sleep_together.v.01', 'ss')
(u'have_intercourse', u'sleep_together.v.01', 'ss')
(u'have_it_away', u'sleep_together.v.01', 'ss')
(u'have_it_off', u'sleep_together.v.01', 'ss')
(u'screw', u'sleep_together.v.01', 'ss')
(u'fuck', u'sleep_together.v.01', 'ss')
(u'jazz', u'sleep_together.v.01', 'ss')
(u'eff', u'sleep_together.v.01', 'ss')
(u'hump', u'sleep_together.v.01', 'ss')
(u'lie_with', u'sleep_together.v.01', 'ss')
(u'bed', u'sleep_together.v.01', 'ss')
(u'have_a_go_at_it', u'sleep_together.v.01', 'ss')
(u'bang', u'sleep_together.v.01', 'ss')
(u'get_it_on', u'sleep_together.v.01', 'ss')
(u'bonk', u'sleep_together.v.01', 'ss')
(u'agape', u'agape.n.01', 'hyp')
(u'agape', u'agape.n.02', 'hyp')
(u'agape_love', u'agape.n.02', 'hyp')
(u'amorousness', u'amorousness.n.01', 'hyp')
(u'enamoredness', u'amorousness.n.01', 'hyp')
(u'ardor', u'ardor.n.02', 'hyp')
(u'ardour', u'ardor.n.02', 'hyp')
(u'benevolence', u'benevolence.n.01', 'hyp')
(u'devotion', u'devotion.n.01', 'hyp')
(u'devotedness', u'devotion.n.01', 'hyp')
(u'filial_love', u'filial_love.n.01', 'hyp')
(u'heartstrings', u'heartstrings.n.01', 'hyp')
(u'lovingness', u'lovingness.n.01', 'hyp')
(u'caring', u'lovingness.n.01', 'hyp')
(u'loyalty', u'loyalty.n.02', 'hyp')
(u'puppy_love', u'puppy_love.n.01', 'hyp')
(u'calf_love', u'puppy_love.n.01', 'hyp')
(u'crush', u'puppy_love.n.01', 'hyp')
(u'infatuation', u'puppy_love.n.01', 'hyp')
(u'worship', u'worship.n.02', 'hyp')
(u'adoration', u'worship.n.02', 'hyp')
(u'adore', u'adore.v.01', 'hyp')
(u'care_for', u'care_for.v.02', 'hyp')
(u'cherish', u'care_for.v.02', 'hyp')
(u'hold_dear', u'care_for.v.02', 'hyp')
(u'treasure', u'care_for.v.02', 'hyp')
(u'dote', u'dote.v.02', 'hyp')
(u'love', u'love.v.03', 'hyp')
(u'get_off', u'get_off.v.06', 'hyp')
(u'romance', u'romance.v.02', 'hyp')
(u'fornicate', u'fornicate.v.01', 'hyp')
(u'take', u'take.v.35', 'hyp')
(u'have', u'take.v.35', 'hyp')
(u'hate', u'hate.n.01', 'ant')
(u'hate', u'hate.v.01', 'ant')
Brave
(u'brave', u'brave.n.01', 'ss')
(u'brave', u'brave.n.02', 'ss')
(u'weather', u'weather.v.01', 'ss')
(u'endure', u'weather.v.01', 'ss')
(u'brave', u'weather.v.01', 'ss')
(u'brave_out', u'weather.v.01', 'ss')
(u'brave', u'brave.a.01', 'ss')
(u'courageous', u'brave.a.01', 'ss')
(u'audacious', u'audacious.s.01', 'ss')
(u'brave', u'audacious.s.01', 'ss')
(u'dauntless', u'audacious.s.01', 'ss')
(u'fearless', u'audacious.s.01', 'ss')
(u'hardy', u'audacious.s.01', 'ss')
(u'intrepid', u'audacious.s.01', 'ss')
(u'unfearing', u'audacious.s.01', 'ss')
(u'brave', u'brave.s.03', 'ss')
(u'braw', u'brave.s.03', 'ss')
(u'gay', u'brave.s.03', 'ss')
(u'desperate', u'desperate.s.04', 'sim')
(u'heroic', u'desperate.s.04', 'sim')
(u'gallant', u'gallant.s.01', 'sim')
(u'game', u'game.s.02', 'sim')
(u'gamy', u'game.s.02', 'sim')
(u'gamey', u'game.s.02', 'sim')
(u'gritty', u'game.s.02', 'sim')
(u'mettlesome', u'game.s.02', 'sim')
(u'spirited', u'game.s.02', 'sim')
(u'spunky', u'game.s.02', 'sim')
(u'lionhearted', u'lionhearted.s.01', 'sim')
(u'stalwart', u'stalwart.s.03', 'sim')
(u'stouthearted', u'stalwart.s.03', 'sim')
(u'undaunted', u'undaunted.s.02', 'sim')
(u'valiant', u'valiant.s.01', 'sim')
(u'valorous', u'valiant.s.01', 'sim')
(u'bold', u'bold.a.01', 'sim')
(u'colorful', u'colorful.a.02', 'sim')
(u'colourful', u'colorful.a.02', 'sim')
(u'timid', u'timid.n.01', 'ant')
(u'cowardly', u'cowardly.a.01', 'ant')
(u'adventurous', u'adventurous.a.01', 'also')
(u'adventuresome', u'adventurous.a.01', 'also')
(u'bold', u'bold.a.01', 'also')
(u'resolute', u'resolute.a.01', 'also')
(u'unafraid', u'unafraid.a.01', 'also')
(u'fearless', u'unafraid.a.01', 'also')
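If you just want a flat, de-duplicated list of candidate synonyms (ignoring the synset name and relation type), the generator output can be collapsed with a set. A minimal sketch reusing get_all_synonyms from above; the helper name is just illustrative:
def synonym_list(word, pos=None):
    # keep only the lemma strings, skip antonyms, drop the word itself and duplicates
    lemmas = set()
    for lemma, synset_name, rel in get_all_synonyms(word, pos):
        if rel != 'ant':
            lemmas.add(lemma.replace('_', ' '))
    lemmas.discard(word)
    return sorted(lemmas)

print(synonym_list('love'))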
How about using RiWordNet (RiTa) instead of JAWS? I see it is listed as one of the available APIs at http://wordnet.princeton.edu/wordnet/related-projects/#Java
I also see it has a getAllSynonyms() method; they provide an example and it works.
Check this out:
import java.io.IOException;
import java.util.Arrays;
import rita.*;

public class Synonyms {

    public static void main(String[] args) throws IOException {
        // Would pass in a PApplet normally, but we don't need to here
        RiWordNet wordnet = new RiWordNet("C:\\Program Files (x86)\\WordNet\\2.1");

        // Get a random noun
        String word = "love"; // wordnet.getRandomWord("n");

        // Get all synonyms of the verb sense
        String[] synonyms = wordnet.getAllSynonyms(word, "v");
        System.out.println("Random noun: " + word);

        if (synonyms != null) {
            // Sort alphabetically
            Arrays.sort(synonyms);
            for (int i = 0; i < synonyms.length; i++) {
                System.out.println("Synonym " + i + ": " + synonyms[i]);
            }
        } else {
            System.out.println("No synonyms!");
        }
    }
}
Make sure to get the latest downloads and documentation from http://www.rednoise.org/rita/wordnet-old/documentation/index.htm

Resources